AVERAGING OF HAMILTONIAN FLOWS WITH
AN ERGODIC COMPONENT 

By Dmitry Dolgopyat 1 and Leonid Koralov 2 

University of Maryland 

We consider a process on $\mathbb{T}^2$, which consists of fast motion along
the stream lines of an incompressible periodic vector field perturbed 
by white noise. It gives rise to a process on the graph naturally asso- 
ciated to the structure of the stream lines of the unperturbed flow. It 
has been shown by Freidlin and Wentzell [Random Perturbations of 
Dynamical Systems, 2nd ed. Springer, New York (1998)] and [Mem. 
Amer. Math. Soc. 109 (1994)] that if the stream function of the flow 
is periodic, then the corresponding process on the graph weakly con- 
verges to a Markov process. We consider the situation where the 
stream function is not periodic, and the flow (when considered on 
the torus) has an ergodic component of positive measure. We show 
that if the rotation number is Diophantine, then the process on the 
graph still converges to a Markov process, which spends a positive 
proportion of time in the vertex corresponding to the ergodic com- 
ponent of the flow. 

1. Introduction. Consider the following stochastic differential equation:

(1) $dX^\varepsilon_t = \frac{1}{\varepsilon} v(X^\varepsilon_t)\,dt + dW_t, \qquad X^\varepsilon_t \in \mathbb{T}^2.$

Here, $v(x)$ is an incompressible periodic vector field, $W_t$ is a two-dimensional Brownian motion and $\varepsilon$ is a small parameter. For simplicity of notation, assume that the period of $v$ in each of the variables is equal to one and that $v$ is infinitely smooth. Let $H(x_1, x_2)$ be the stream function of the flow, that is,

$\nabla^\perp H = (-H'_{x_2}, H'_{x_1}) = v.$

Since $v$ is periodic, we can write $H$ as

$H(x_1, x_2) = H_0(x_1, x_2) + a x_1 + b x_2,$

where $H_0$ is periodic. We shall assume that all the critical points of $H$ are nondegenerate, and that $(a, b)$ satisfy the following Diophantine condition.

Let $\rho = a/b$ be irrational. Without loss of generality, we may assume that $0 < a < b$ (the general case can be obtained by interchanging $x_1$ and $x_2$, and/or replacing $x_1$ by $-x_1$, if needed). Let $[a_1, a_2, \ldots, a_n, \ldots]$ be the continued fraction expansion of $\rho$. We assume that

(2) $a_n < n^2$ for all sufficiently large $n$.

We shall see in Section A.3 that this condition holds for almost all $\rho$ (with respect to the Lebesgue measure on $[0, 1]$).
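
As an illustration of condition (2), the following minimal sketch computes the partial quotients of a rational approximation of the rotation number and reports the last index at which the bound $n^2$ is reached or exceeded. The function names `partial_quotients` and `last_violation` are hypothetical helpers introduced here, and a rational input can only probe finitely many terms of the expansion, so this is a sanity check rather than a verification of (2).

```python
from fractions import Fraction


def partial_quotients(rho, max_terms=64):
    """Return [a_1, a_2, ...] with rho = 1/(a_1 + 1/(a_2 + ...)) for a rational 0 < rho < 1."""
    x = Fraction(rho)
    if not 0 < x < 1:
        raise ValueError("rho must lie strictly between 0 and 1")
    quotients = []
    while x != 0 and len(quotients) < max_terms:
        x = 1 / x                          # invert the current fractional part
        a = x.numerator // x.denominator   # its integer part is the next partial quotient
        quotients.append(a)
        x -= a                             # keep only the fractional part
    return quotients


def last_violation(quotients):
    """Largest 1-based index n with a_n >= n^2, or 0 if there is none."""
    last = 0
    for n, a in enumerate(quotients, start=1):
        if a >= n * n:
            last = n
    return last
```

Since $a_1 \ge 1 = 1^2$ always, index 1 is always reported; condition (2) only concerns large $n$, so what matters is that the reported index stays bounded as better approximations of the rotation number are used.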

For $a$ and $b$ which are rationally independent, as in our case, it has been shown by Arnold in [1] that the structure of the streamlines of $v$ on the torus is as follows. There are finitely many domains $U_k$, $k = 1, \ldots, n$, bounded by the separatrices of $H$, such that the trajectories of the dynamical system $\dot X_t = v(X_t)$ in $U_k$ are either periodic or tend to a point where the vector field is equal to zero. The trajectories form one ergodic class outside the domains $U_k$. More precisely, let $\mathcal{E} = \mathbb{T}^2 \setminus \mathrm{Cl}(\bigcup_{k=1}^n U_k)$. Here, $\mathrm{Cl}(\cdot)$ stands for the closure of a set. The dynamical system is then ergodic on $\mathcal{E}$ (and is, in fact, mixing; see [8]).

Although $H$ itself is not periodic, we can consider its critical points as points on the torus since $\nabla H$ is periodic. All the maxima and minima of $H$ are located inside the domains $U_k$.
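
To make the setting concrete, here is a minimal Euler-Maruyama sketch of equation (1) on the unit torus. The stream function used, $H = \sin(2\pi x_1)\sin(2\pi x_2)/(2\pi) + a x_1 + b x_2$ with the particular values of `A` and `B`, is a hypothetical choice made only for illustration; it is not taken from the paper and is not claimed to satisfy all of its assumptions. The names `grad_H`, `v` and `simulate` are likewise introduced here.

```python
import math
import random

# Hypothetical linear part of H; A / B approximates an irrational number, with 0 < A < B.
A = 0.2 * (math.sqrt(5.0) - 1.0) / 2.0
B = 0.2
C = 2.0 * math.pi


def grad_H(x1, x2):
    """Gradient of H = sin(C x1) sin(C x2) / C + A x1 + B x2 (illustrative choice)."""
    return (math.cos(C * x1) * math.sin(C * x2) + A,
            math.sin(C * x1) * math.cos(C * x2) + B)


def v(x1, x2):
    """Skew gradient (-H'_{x2}, H'_{x1}); it is divergence-free by construction."""
    h1, h2 = grad_H(x1, x2)
    return (-h2, h1)


def simulate(eps, t_end, dt, x0, seed=0):
    """Euler-Maruyama steps for dX = (1/eps) v(X) dt + dW, reduced modulo 1."""
    rng = random.Random(seed)
    x1, x2 = x0
    root_dt = math.sqrt(dt)
    for _ in range(int(round(t_end / dt))):
        v1, v2 = v(x1, x2)
        x1 = (x1 + v1 * dt / eps + root_dt * rng.gauss(0.0, 1.0)) % 1.0
        x2 = (x2 + v2 * dt / eps + root_dt * rng.gauss(0.0, 1.0)) % 1.0
    return x1, x2
```

The time step has to be small compared with $\varepsilon$, since the drift in (1) is of order $1/\varepsilon$; a fixed seed makes the sample path reproducible.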

At first, we shall consider the case where there is just one periodic component $U$, which contains only one critical point of $H$ (a maximum or a minimum). An example of a phase portrait of such a vector field $v$ (considered on the plane) is given in Figure 1. The general case is discussed in Section A.4.

Assume, for definiteness, that the critical point of $H$ inside $U$ is a maximum. We shall denote the saddle point of $H$ on the torus by $A$ and the maximum by $M$. Consider the following mapping of the torus onto the segment $I = [0, H(M) - H(A)]$ of the real line:

$$h(x) = \begin{cases} 0, & \text{if } x \in \mathcal{E}, \\ H(x) - H(A), & \text{otherwise.} \end{cases}$$

We denote the set $\{x \in \mathrm{Cl}(U) : H(x) - H(A) = h\}$ by $\gamma(h)$. Let $\gamma = \gamma(0) = \partial U$. Let $Lf = a(h) f'' + b(h) f'$ be the differential operator on $I$ with the coefficients

(3) $$a(h) = \frac{1}{2} \Big( \int_{\gamma(h)} \frac{1}{|\nabla H|}\,dl \Big)^{-1} \int_{\gamma(h)} |\nabla H|\,dl$$

and

(4) $$b(h) = \frac{1}{2} \Big( \int_{\gamma(h)} \frac{1}{|\nabla H|}\,dl \Big)^{-1} \int_{\gamma(h)} \frac{\Delta H}{|\nabla H|}\,dl.$$

Let $k = 2 (\int_\gamma |\nabla H|\,dl)^{-1}\,\mathrm{Area}(\mathcal{E})$. Consider the process $Y_t$ on the segment $I$ which is defined via its generator $\mathcal{L}$ as follows. The domain of the generator $D(\mathcal{L})$ consists of those functions $f \in C(I)$ which:

(a) are twice continuously differentiable in the interior of $I$;

(b) have limits $\lim_{h \to 0} Lf(h)$ and $\lim_{h \to (H(M) - H(A))} Lf(h)$ at the endpoints of $I$;

(c) have the limit $\lim_{h \to 0} f'(h)$ and $\lim_{h \to 0} f'(h) = k \lim_{h \to 0} Lf(h)$.

For functions $f$ which satisfy the above three properties, we define $\mathcal{L}f = Lf$ in the interior of the segment and as the limit of $Lf$ at the endpoints of $I$.

It is well known (see, e.g., [11]) that there exists a strong Markov process on $I$ with continuous trajectories, with the generator $\mathcal{L}$. The measure on $C([0, \infty), I)$ induced by the process is uniquely defined by the operator and the initial distribution of the process.

We shall prove the following theorem. 

Theorem 1. The measure on $C([0, \infty), I)$ induced by the process $Y^\varepsilon_t = h(X^\varepsilon_t)$ converges weakly to the measure induced by the process with the generator $\mathcal{L}$ with the initial distribution $h(X^\varepsilon_0)$.

One of the main ingredients of the proof is the estimate of the expectation of the time it takes for the solution of (1) to leave the ergodic component. This estimate will be derived in Sections 4 and 5. Besides this, we shall use a number of estimates on the transition times of the process between different level sets of the Hamiltonian inside the periodic component. Those will be proven in Section 3. First, however, we prove Theorem 1, while assuming that we have all the needed estimates. We discuss the case of more than one periodic component in Section A.4. Sections A.1-A.3 contain technical estimates needed for the proof.

Fig. 1. Structure of the stream lines.

We observe that irrationality of $\rho$ is necessary for Theorem 1 since if $\rho$ is rational, then the restriction of $X_t$ to $\mathcal{E}$ is periodic rather than ergodic, so the phase space of the limiting process is a graph with two edges (one of which forms a loop) and one vertex (cf. [7]). On the other hand, it has been conjectured by Freidlin [5] that Theorem 1 holds for all irrational values of $\rho$. Our paper proves this result for those $\rho$ which cannot be approximated too well by rationals. On the other hand, Sowers [14] shows that in the opposite case of those $\rho$ which are very well approximable by rationals, the result is also true. There is still a gap between our condition (2) and the numbers considered in [14], but we hope that the combination of our approach with the methods of [14] will allow the result to be established in full generality.

We note that there is a related problem, which concerns the asymptotics of effective diffusivity for a two-dimensional periodic vector field perturbed by small diffusion. It also involves the study of the behavior of the process $X^\varepsilon_t$ near the saddle points and near the separatrices. We refer the interested reader to the papers by Fannjiang and Papanicolaou [4], Koralov [10], Sowers [15] and Novikov, Papanicolaou and Ryzhik [13] for some of the recent results.



2. Proof of the main theorem. Let $\Psi$ be the subset of $C(I)$ which consists of all bounded functions which are continuously differentiable on $[0, H(M) - H(A))$ (the derivative at $h = 0$ is one sided). Note that this is a measure-defining set, that is, the equality $\int_I u\,d\mu_1 = \int_I u\,d\mu_2$ for all $u \in \Psi$ implies that $\mu_1 = \mu_2$. Let $\mathcal{D}$ be the subset of $D(\mathcal{L})$ which consists of all the functions $f$ for which $\mathcal{L}f \in \Psi$.

We formulate the following lemma. 

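Lemma 2.1. For any function $f \in \mathcal{D}$, any initial point $x \in \mathbb{T}^2$ and any $T > 0$, we have

(5) $$E_x \Big[ f(h(X^\varepsilon_T)) - f(h(X^\varepsilon_0)) - \int_0^T \mathcal{L}f(h(X^\varepsilon_s))\,ds \Big] \to 0 \quad \text{as } \varepsilon \to 0,$$

uniformly in $x \in \mathbb{T}^2$.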



An analogous lemma was used in the monograph of Freidlin and Wentzell [7] to justify the convergence of the process $Y^\varepsilon_t$ to the limiting process on the graph. The main idea, roughly speaking, is to use the tightness of the family $Y^\varepsilon_t$ and then to show that the limiting process (along any subsequence) is a solution of the martingale problem, corresponding to the operator $\mathcal{L}$. Here, as in [7], the fact that for every $u \in \Psi$ and $\lambda > 0$, the equation $\lambda f - \mathcal{L}f = u$ has a solution $f \in \mathcal{D}$ is used.

Fig. 2. Definition of the stopping times.

The main difference between our case and that of [7] is the presence of 
an ergodic component. However, all the arguments used to prove the main 
theorem based on (5) remain the same. Thus, in referring to Lemma 3.1 of 
[7], it is enough to prove our Lemma 2.1 above. 

The proof of Lemma 2.1 will rely on several other lemmas. Below, we shall introduce a number of processes, stopping times and sets, which will depend on $\varepsilon$. However, we shall not always incorporate this dependence on $\varepsilon$ into notation, so one must be careful to distinguish between the objects which do not depend on $\varepsilon$ and those which do.

Let $\tilde\tau$ be the first time when the process $X^\varepsilon_t$ reaches the set $\gamma(\varepsilon^{1/2})$. We will need the following estimate on the expectation of $\tilde\tau$, which is proved in Sections 4 and 5.

Lemma 2.2. For any $\varkappa > 0$, there exists some $\varepsilon_0 > 0$ such that $E_x \tilde\tau \le \varepsilon^{1/2 - \varkappa}$ for $\varepsilon < \varepsilon_0$, for all $x \in \mathrm{Cl}(\mathcal{E})$.

Let us now choose constants $\varkappa$ and $\alpha$, such that $0 < \varkappa < 1/4 < \alpha < 1/2$. Let $\bar\gamma = \gamma(\varepsilon^\alpha)$. Recall that $\gamma = \gamma(0)$ is the boundary of $U$. Let $\tau$ be the first time when the process $X^\varepsilon_t$ reaches $\bar\gamma$ and $\sigma$ be the first time when the process reaches $\gamma$. We inductively define the following two sequences of stopping times. Let $\tau_1 = \tau$. For $n \ge 1$, let $\sigma_n$ be the first time following $\tau_n$ when the process reaches $\gamma$. For $n \ge 2$, let $\tau_n$ be the first time following $\sigma_{n-1}$ when the process reaches $\bar\gamma$. See Figure 2.

We consider the discrete-time Markov chains $\xi^1_n = X^\varepsilon_{\tau_n}$ and $\xi^2_n = X^\varepsilon_{\sigma_n}$ with the state spaces $\bar\gamma$ and $\gamma$, respectively. Let $P_1(x, dy)$ and $P_2(x, dy)$ be transition operators for the Markov chains $\xi^1_n$ and $\xi^2_n$, respectively. They are uniformly exponentially mixing in the following sense.

Lemma 2.3. There exist constants $0 < c < 1$, $\varepsilon_0 > 0$, $n_0 > 0$ and probability measures $\mu$ and $\nu$ (which depend on $\varepsilon$) on $\bar\gamma$ and $\gamma$, respectively, such that for $\varepsilon < \varepsilon_0$ and $n \ge n_0$, we have

(6) $$\sup_{x \in \bar\gamma} \mathrm{Var}\big(P_1^n(x, dy) - \mu(dy)\big) \le c^n, \qquad \sup_{x \in \gamma} \mathrm{Var}\big(P_2^n(x, dy) - \nu(dy)\big) \le c^n,$$

where Var is the total variation of the signed measure.

We prove this lemma in Section A.2. Let us now examine the transition times between $\gamma$ and $\bar\gamma$, assuming that we start with the invariant measures. The following lemma is proved in Section 3.

Lemma 2.4. The asymptotic behavior of the transition times is the following:

(7) $E_\mu \sigma = k_1 \varepsilon^\alpha (1 + o(1))$ as $\varepsilon \to 0$,

(8) $E_\nu \tau = k_2 \varepsilon^\alpha (1 + o(1))$ as $\varepsilon \to 0$,

where $k_1 = 2 (\int_\gamma |\nabla H|\,dl)^{-1}\,\mathrm{Area}(U)$ and $k_2 = 2 (\int_\gamma |\nabla H|\,dl)^{-1}\,\mathrm{Area}(\mathcal{E})$. Further, we have the following estimate:

$$\sup_{x \in \gamma} E_x \tau \le k_3 \varepsilon^{\alpha - \varkappa} \quad \text{as } \varepsilon \to 0$$

for some $k_3 > 0$.

We will need to control the number of excursions between $\gamma$ and $\bar\gamma$ before time $T$. For this purpose, we formulate the following lemma, which will be proven in Section 3.

Lemma 2.5. There is a constant $r > 0$ such that for all sufficiently small $\varepsilon$, we have

$$\sup_{x \in \bar\gamma} E_x e^{-\sigma} \le 1 - r \varepsilon^\alpha.$$

Using the Markov property of the process and Lemma 2.5, we get the estimate

(9) $$\sup_{x \in \mathbb{T}^2} E_x e^{-\sigma_n} = \sup_{x \in \bar\gamma} E_x e^{-\sigma_n} \le \Big( \sup_{x \in \bar\gamma} E_x e^{-\sigma} \Big)^n \le (1 - r \varepsilon^\alpha)^n.$$

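The next lemma (also proved in Section 3) allows us to estimate integrals of the type (5) over intervals $[0, \tau]$ and $[0, \sigma]$.

Lemma 2.6. For any function $f \in \mathcal{D}$, we have the following asymptotic estimates:

(10) $$\sup_{x \in \bar\gamma} \Big| E_x \Big[ f(h(X^\varepsilon_\sigma)) - f(h(X^\varepsilon_0)) - \int_0^\sigma \mathcal{L}f(h(X^\varepsilon_s))\,ds \Big] \Big| = o(\varepsilon^\alpha) \quad \text{as } \varepsilon \to 0,$$

(11) $$\Big| E_\nu \Big[ f(h(X^\varepsilon_\tau)) - f(h(X^\varepsilon_0)) - \int_0^\tau \mathcal{L}f(h(X^\varepsilon_s))\,ds \Big] \Big| = o(\varepsilon^\alpha) \quad \text{as } \varepsilon \to 0.$$

Moreover, we also have

(12) $$\sup_{x \in \mathbb{T}^2} \Big| E_x \Big[ f(h(X^\varepsilon_\tau)) - f(h(X^\varepsilon_0)) - \int_0^\tau \mathcal{L}f(h(X^\varepsilon_s))\,ds \Big] \Big| \to 0 \quad \text{as } \varepsilon \to 0.$$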



Proof of Lemma 2.1. Let $f \in \mathcal{D}$, $T > 0$ and $\eta > 0$ be fixed. We would like to show that the absolute value of the left-hand side of (5) is less than $\eta$ for all sufficiently small positive $\varepsilon$.

First, we replace the time interval $[0, T]$ by a larger one, $[0, \hat\tau]$, where $\hat\tau$ is the first of the stopping times $\tau_n$ which is greater than or equal to $T$, that is,

$$\hat\tau = \min_{n : \tau_n \ge T} \tau_n.$$



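Using the Markov property of the process, the difference can be rewritten as follows:

$$\Big| E_x \Big[ f(h(X^\varepsilon_{\hat\tau})) - f(h(X^\varepsilon_0)) - \int_0^{\hat\tau} \mathcal{L}f(h(X^\varepsilon_s))\,ds \Big] - E_x \Big[ f(h(X^\varepsilon_T)) - f(h(X^\varepsilon_0)) - \int_0^T \mathcal{L}f(h(X^\varepsilon_s))\,ds \Big] \Big|$$
$$= \Big| E_x E_{X^\varepsilon_T} \Big[ f(h(X^\varepsilon_\tau)) - f(h(X^\varepsilon_0)) - \int_0^\tau \mathcal{L}f(h(X^\varepsilon_s))\,ds \Big] \Big|.$$

The latter expression can be made smaller than $\eta/5$ for all sufficiently small $\varepsilon$ due to (12). Therefore, it remains to show that

$$\Big| E_x \Big[ f(h(X^\varepsilon_{\hat\tau})) - f(h(X^\varepsilon_0)) - \int_0^{\hat\tau} \mathcal{L}f(h(X^\varepsilon_s))\,ds \Big] \Big| < \frac{4\eta}{5}$$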



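for all sufficiently small $\varepsilon$. We shall denote the indicator function of a set $A$ by $\chi_A$. Using the stopping times $\tau_n$ and $\sigma_n$, we can rewrite the expectation in the left-hand side of this inequality as follows:

$$E_x \Big[ f(h(X^\varepsilon_{\hat\tau})) - f(h(X^\varepsilon_0)) - \int_0^{\hat\tau} \mathcal{L}f(h(X^\varepsilon_s))\,ds \Big] = E_x \Big[ f(h(X^\varepsilon_\tau)) - f(h(X^\varepsilon_0)) - \int_0^\tau \mathcal{L}f(h(X^\varepsilon_s))\,ds \Big]$$
$$+ E_x \sum_{n=1}^\infty \chi_{\{\tau_n < \hat\tau\}} \Big[ f(h(X^\varepsilon_{\sigma_n})) - f(h(X^\varepsilon_{\tau_n})) - \int_{\tau_n}^{\sigma_n} \mathcal{L}f(h(X^\varepsilon_s))\,ds \Big]$$
$$+ E_x \sum_{n=1}^\infty \chi_{\{\sigma_n < \hat\tau\}} \Big[ f(h(X^\varepsilon_{\tau_{n+1}})) - f(h(X^\varepsilon_{\sigma_n})) - \int_{\sigma_n}^{\tau_{n+1}} \mathcal{L}f(h(X^\varepsilon_s))\,ds \Big],$$

provided that the sums in the right-hand side converge absolutely (which follows from the arguments below). Due to (12), the absolute value of the first term on the right-hand side of this equality can be made smaller than $\eta/5$ for all sufficiently small $\varepsilon$. Therefore, it remains to estimate the two infinite sums.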

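Let us start with the first sum. Note that

$$E_x \chi_{\{\tau_n < \hat\tau\}} \le E_x \chi_{\{\sigma_{n-1} < T\}} \le E_x e^{T - \sigma_{n-1}} \le e^T (1 - r \varepsilon^\alpha)^{n-1}.$$

The last inequality here is due to (9) and Chebyshev's inequality. Taking the sum in $n$, we obtain

$$\sum_{n=1}^\infty E_x \chi_{\{\tau_n < \hat\tau\}} \le \sum_{n=1}^\infty e^T (1 - r \varepsilon^\alpha)^{n-1} \le K \varepsilon^{-\alpha},$$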



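where the constant $K$ depends on $T$ and $r$. By Lemma 2.6, we can find $\varepsilon_0$ such that for all $\varepsilon < \varepsilon_0$, we have

$$\sup_{x \in \bar\gamma} \Big| E_x \Big[ f(h(X^\varepsilon_\sigma)) - f(h(X^\varepsilon_0)) - \int_0^\sigma \mathcal{L}f(h(X^\varepsilon_s))\,ds \Big] \Big| \le \frac{\eta \varepsilon^\alpha}{5K}.$$

Therefore, for $\varepsilon < \varepsilon_0$, we have

$$\Big| E_x \sum_{n=1}^\infty \chi_{\{\tau_n < \hat\tau\}} \Big[ f(h(X^\varepsilon_{\sigma_n})) - f(h(X^\varepsilon_{\tau_n})) - \int_{\tau_n}^{\sigma_n} \mathcal{L}f(h(X^\varepsilon_s))\,ds \Big] \Big|$$
$$\le \sup_{x \in \bar\gamma} \Big| E_x \Big[ f(h(X^\varepsilon_\sigma)) - f(h(X^\varepsilon_0)) - \int_0^\sigma \mathcal{L}f(h(X^\varepsilon_s))\,ds \Big] \Big| \cdot E_x \sum_{n=1}^\infty \chi_{\{\tau_n < \hat\tau\}} \le \frac{\eta}{5}.$$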



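The same argument allows us to write the estimate as

$$\Big| \sum_{n=1}^\infty E_x \chi_{\{\sigma_n < \hat\tau\}}\, E_\nu \Big[ f(h(X^\varepsilon_\tau)) - f(h(X^\varepsilon_0)) - \int_0^\tau \mathcal{L}f(h(X^\varepsilon_s))\,ds \Big] \Big| \le \frac{\eta}{5}.$$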



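The left-hand side of this inequality, however, is slightly different from the desired expression, since (11) only allows us to estimate the expectation with respect to the original distribution $\nu$, rather than the supremum over all possible initial points. Thus, we need to estimate the difference

$$\Big| \sum_{n=1}^\infty E_x \chi_{\{\sigma_n < \hat\tau\}}\, E_\nu \Big[ f(h(X^\varepsilon_\tau)) - f(h(X^\varepsilon_0)) - \int_0^\tau \mathcal{L}f(h(X^\varepsilon_s))\,ds \Big]$$
$$- \sum_{n=1}^\infty E_x \Big( \chi_{\{\sigma_n < \hat\tau\}} \Big[ f(h(X^\varepsilon_{\tau_{n+1}})) - f(h(X^\varepsilon_{\sigma_n})) - \int_{\sigma_n}^{\tau_{n+1}} \mathcal{L}f(h(X^\varepsilon_s))\,ds \Big] \Big) \Big|.$$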



This expression can be estimated from above by

$$\sup_{x \in \gamma} \Big| E_x \Big[ f(h(X^\varepsilon_\tau)) - f(h(X^\varepsilon_0)) - \int_0^\tau \mathcal{L}f(h(X^\varepsilon_s))\,ds \Big] \Big| \times \sum_{n=1}^\infty \sup_{x \in \gamma} \mathrm{Var}\big(P_2^{n-1}(x, dy) - \nu(dy)\big),$$

which is smaller than $\eta/5$ for all sufficiently small $\varepsilon$, due to (6) and (12). Combining the above estimates, we see that the absolute value of the left-hand side of (5) is less than $\eta$ for all sufficiently small positive $\varepsilon$. This completes the proof of the theorem. □



3. Behavior inside the periodic component. In this section, we prove Lemmas 2.4, 2.5 and 2.6. At several points, we shall use Lemma 2.2, which is proved in Sections 4 and 5. This involves no circular reasoning since the results of this section are not used in Sections 4 and 5.

We shall need a number of statements from [7] and [10] which describe the limiting behavior of the process $X^\varepsilon_t$ with the initial point $x \in U$, both in the case when $x$ is fixed, and when $x$ is asymptotically close (as a power of $\varepsilon$) to the boundary of $U$.

It has been shown in [7] (Theorem 2.2) that if $X^\varepsilon_0 = x \in U$ is fixed, then the process $h(X^\varepsilon_t)$, stopped at $t = \sigma$, converges weakly to the diffusion process on $I$ with the generator $L$, which starts at $h(x)$ and is stopped at the moment when it reaches zero (the left endpoint of $I$). We formulate the following statements, which easily follow from the proof of Theorem 2.2 of [7], as a separate lemma.



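Lemma 3.1. (a) For any function $f \in \mathcal{D}$, we have

$$\lim_{\varepsilon \to 0} \sup_{x \in U} \Big| E_x \Big[ f(h(X^\varepsilon_\sigma)) - f(h(X^\varepsilon_0)) - \int_0^\sigma \mathcal{L}f(h(X^\varepsilon_s))\,ds \Big] \Big| = 0.$$

(b) For any $h > 0$, there is a constant $c(h) > 0$ such that

$$\limsup_{\varepsilon \to 0} \sup_{x \in \gamma(h)} E_x e^{-\sigma} \le 1 - c(h).$$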



We will also need the following lemma, which gives us the asymptotics of 
the time the process needs in order to exit the periodic component if the 
original point is asymptotically close to the boundary. It was proven in [10] 
(Lemma 4.4). 

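Lemma 3.2. There is a constant $k_1$ such that for any $1/4 < \alpha < 1/2$, we have

$$\lim_{\varepsilon \to 0} \sup_{x \in \gamma(\varepsilon^\alpha)} \Big| \frac{E_x \sigma}{\varepsilon^\alpha} - k_1 \Big| = 0.$$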

Proof of Lemma 2.4. The first part of the lemma follows from Lemma 3.2. Namely, it states that there exists a constant $k_1$ such that

(13) $$\lim_{\varepsilon \to 0} \sup_{x \in \bar\gamma} \Big| \frac{E_x \sigma}{\varepsilon^\alpha} - k_1 \Big| = 0.$$

Let us recall how to identify the constant $k_1$ (rigorous arguments can be found in [10]). If $x \in U$ did not depend on $\varepsilon$, then the asymptotics of $E_x \sigma$ could be obtained using the results of [7]. Namely, recall the definition of the differential operator $L$ from Section 1 and let $u(h)$ be the bounded solution of the ordinary differential equation

(14) $Lu = a(h) u'' + b(h) u' = -1, \qquad h \in \mathrm{Int}(I),$

with the boundary condition $u(0) = 0$. Such a solution exists and is unique (see, e.g., [7]). It is equal to the expectation of the time it takes for the limiting process, starting at $h$, to reach the endpoint of $I$ corresponding to the boundary of the periodic component. It was demonstrated in [7] (Lemma 2.3) that $\lim_{\varepsilon \to 0} E_x \sigma = u(h(x))$. In particular,

$$\lim_{\varepsilon \to 0} E_x \sigma = u'(0) h(x) + o(h(x)) \quad \text{as } h(x) \to 0.$$

Formula (13) is the corresponding asymptotic formula in the case where $h(x)$ is a function of $\varepsilon$, that is, $h(x) = \varepsilon^\alpha$. In particular, $k_1 = u'(0)$. Equation (14) can be solved explicitly using the expressions for the coefficients $a(h)$ and $b(h)$. We obtain that

$$u'(0) = 2 \Big( \int_\gamma |\nabla H|\,dl \Big)^{-1} \int_0^{H(M) - H(A)} \int_{\gamma(h)} \frac{1}{|\nabla H|}\,dl\,dh = 2 \Big( \int_\gamma |\nabla H|\,dl \Big)^{-1} \mathrm{Area}(U),$$

which proves (7).

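In order to study the asymptotics of $E_\nu \tau$, we shall use the asymptotics of $E_\mu \sigma$ and the fact that $X^\varepsilon_t$ is an ergodic process on $\mathbb{T}^2$. Let $V^\varepsilon = \{x \in U : 0 < h(x) < \varepsilon^\alpha\}$.

The process $X^\varepsilon_t$ is ergodic and the invariant measure is the Lebesgue measure on $\mathbb{T}^2$. Applying the Birkhoff ergodic theorem to the process with the initial distribution $\nu$, we obtain that

$$\lim_{n \to \infty} \frac{\int_0^n \chi_{\mathcal{E}}(X^\varepsilon_t)\,dt}{n} = \mathrm{Area}(\mathcal{E}) \quad \text{almost surely},$$

where $\chi_{\mathcal{E}}$ is the indicator function of the set $\mathcal{E}$. Also, from the Birkhoff ergodic theorem, we get

$$\lim_{n \to \infty} \frac{\sigma_n}{n} = E_\nu \sigma_1 = E_\nu \tau + E_\mu \sigma \quad \text{almost surely}.$$

Using the Birkhoff ergodic theorem again, we can write

$$\lim_{n \to \infty} \frac{\int_0^n \chi_{\mathcal{E}}(X^\varepsilon_t)\,dt}{n} = \lim_{n \to \infty} \frac{\int_0^{\sigma_n} \chi_{\mathcal{E}}(X^\varepsilon_t)\,dt}{\sigma_n} = \Big( \lim_{n \to \infty} \frac{\int_0^{\sigma_n} \chi_{\mathcal{E}}(X^\varepsilon_t)\,dt}{n} \Big) \Big( \lim_{n \to \infty} \frac{n}{\sigma_n} \Big) = \frac{E_\nu \int_0^{\sigma_1} \chi_{\mathcal{E}}(X^\varepsilon_t)\,dt}{E_\nu \tau + E_\mu \sigma} \le \frac{E_\nu \tau}{E_\nu \tau + E_\mu \sigma},$$

where the equalities hold almost surely. Therefore,

$$\mathrm{Area}(\mathcal{E}) \le \frac{E_\nu \tau}{E_\nu \tau + E_\mu \sigma}.$$

In exactly the same way, we can prove that

$$\mathrm{Area}(\mathcal{E} \cup V^\varepsilon) \ge \frac{E_\nu \tau}{E_\nu \tau + E_\mu \sigma}, \qquad \mathrm{Area}(U \setminus V^\varepsilon) \le \frac{E_\mu \sigma}{E_\nu \tau + E_\mu \sigma}$$

and

$$\mathrm{Area}(U) \ge \frac{E_\mu \sigma}{E_\nu \tau + E_\mu \sigma}.$$

Combining these four estimates, we obtain that

$$\frac{\mathrm{Area}(\mathcal{E})}{\mathrm{Area}(U)} \le \frac{E_\nu \tau}{E_\mu \sigma} \le \frac{\mathrm{Area}(\mathcal{E} \cup V^\varepsilon)}{\mathrm{Area}(U \setminus V^\varepsilon)}.$$

Since $\lim_{\varepsilon \to 0} \mathrm{Area}(V^\varepsilon) = 0$, we obtain

(15) $$\lim_{\varepsilon \to 0} \frac{E_\nu \tau}{E_\mu \sigma} = \frac{\mathrm{Area}(\mathcal{E})}{\mathrm{Area}(U)}.$$

Therefore, (7) implies (8). It remains to prove the last statement of the lemma.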

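Let $\tilde\gamma = \{x : h(x) = \varepsilon^{1/2}\}$. We inductively define the following two sequences of stopping times. Let $\tilde\tau_1 = \tilde\tau$ be the first time when the process $X^\varepsilon_t$ reaches the set $\tilde\gamma$. For $n \ge 1$, let $\tilde\sigma_n$ be the first time following $\tilde\tau_n$ when the process reaches $\gamma$. For $n \ge 2$, let $\tilde\tau_n$ be the first time following $\tilde\sigma_{n-1}$ when the process reaches $\tilde\gamma$.

First, let us estimate the probability of the event that the process which starts at $x \in \tilde\gamma$ reaches $\bar\gamma$ before reaching $\gamma$. Lemma 4.3 of [10] states that there is a constant $c_1$ such that for any $x \in \tilde\gamma$,

$$|P_x(\tau < \sigma) - \varepsilon^{1/2 - \alpha}| \le c_1 \varepsilon^\alpha |\ln \varepsilon|.$$

Since $\alpha > 1/4$, this implies that $P_x(\tau > \sigma) \le 1 - \frac{1}{2} \varepsilon^{1/2 - \alpha}$ for all sufficiently small $\varepsilon$, for all $x \in \tilde\gamma$. Using the Markov property of the process, we conclude that

$$\sup_{x \in \gamma} P_x(\tau > \tilde\sigma_n) \le \big(1 - \tfrac{1}{2} \varepsilon^{1/2 - \alpha}\big)^n.$$

We also need to estimate how much time it takes for the process which starts at $x \in \tilde\gamma$ to leave $V^\varepsilon$ (the region between $\gamma$ and $\bar\gamma$). Lemma 4.2 of [10] states that there is a constant $c_2$ such that for any $x \in V^\varepsilon$,

$$E_x \min(\tau, \sigma) \le c_2 \varepsilon^{2\alpha} |\ln \varepsilon|.$$

Since $2\alpha > 1/2 - \varkappa$, the right-hand side of this inequality is smaller than $\varepsilon^{1/2 - \varkappa}$ for all sufficiently small $\varepsilon$. Therefore, by Lemma 2.2,

$$\sup_{x \in \gamma} E_x \min(\tau, \tilde\sigma_1) \le \sup_{x \in \gamma} E_x \tilde\tau + \sup_{x \in \tilde\gamma} E_x \min(\tau, \sigma) \le 2 \varepsilon^{1/2 - \varkappa}.$$

Finally, due to the Markov property of the process,

$$\sup_{x \in \gamma} E_x \tau = \sup_{x \in \gamma} \Big[ E_x \min(\tau, \tilde\sigma_1) + \sum_{n=1}^\infty \big( E_x \min(\tau, \tilde\sigma_{n+1}) - E_x \min(\tau, \tilde\sigma_n) \big) \Big]$$
$$= \sup_{x \in \gamma} \Big[ E_x \min(\tau, \tilde\sigma_1) + \sum_{n=1}^\infty E_x \big( \chi_{\{\tau > \tilde\sigma_n\}}\, E_{X^\varepsilon_{\tilde\sigma_n}} \min(\tau, \tilde\sigma_1) \big) \Big]$$
$$\le 2 \varepsilon^{1/2 - \varkappa} \sum_{n=0}^\infty \big(1 - \tfrac{1}{2} \varepsilon^{1/2 - \alpha}\big)^n = 4 \varepsilon^{\alpha - \varkappa}.$$

This completes the proof of the lemma. □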
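
The last step of the proof is a geometric series: with $q = \frac{1}{2}\varepsilon^{1/2-\alpha}$, the sum of $(1-q)^n$ over $n \ge 0$ equals $1/q$, and multiplying by $2\varepsilon^{1/2-\varkappa}$ gives $4\varepsilon^{\alpha-\varkappa}$. The following sketch checks this arithmetic numerically; the function name `excursion_time_bound` and the sample parameter values are assumptions made for illustration only.

```python
import math


def excursion_time_bound(eps, alpha, kappa, terms=100_000):
    """Truncated value of 2 * eps^(1/2 - kappa) * sum_{n >= 0} (1 - q)^n, q = eps^(1/2 - alpha) / 2."""
    q = 0.5 * eps ** (0.5 - alpha)
    partial_sum = sum((1.0 - q) ** n for n in range(terms))
    return 2.0 * eps ** (0.5 - kappa) * partial_sum


def closed_form(eps, alpha, kappa):
    """The closed form 4 * eps^(alpha - kappa) obtained from the full geometric series."""
    return 4.0 * eps ** (alpha - kappa)
```

For parameters with $0 < \varkappa < 1/4 < \alpha < 1/2$ and moderately small $\varepsilon$, the truncated sum and the closed form agree to floating-point accuracy; for very small $\varepsilon$ more terms would be needed because $q$ becomes small.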
Proof of Lemma 2.5. For a fixed $h$, let $\tau'$ be the first time when the process $X^\varepsilon_t$ reaches the set $\gamma(h)$. The probability of the event that the process which starts at $x \in \bar\gamma$ reaches $\gamma(h)$ before reaching $\gamma$ is estimated in the proof of Lemma 4.4 of [10]. Namely, from formula (35) of [10], it follows that there is a positive $h$ such that

$$\inf_{x \in \bar\gamma} P_x(\tau' < \sigma) \ge \frac{\varepsilon^\alpha}{2h}$$

for all $\varepsilon$ which are sufficiently small. Let us now fix a value of $h$ for which this inequality holds and examine the process $X^\varepsilon_t$ which starts at $x \in \gamma(h)$. By the second part of Lemma 3.1, there is a constant $c(h) > 0$ such that

$$\sup_{x \in \gamma(h)} E_x e^{-\sigma} \le 1 - c(h)$$

for all sufficiently small $\varepsilon$. Due to the Markov property of the process,

$$\sup_{x \in \bar\gamma} E_x e^{-\sigma} \le 1 - \inf_{x \in \bar\gamma} P_x(\tau' < \sigma) + \inf_{x \in \bar\gamma} P_x(\tau' < \sigma) \sup_{x \in \gamma(h)} E_x e^{-\sigma} \le 1 - \frac{c(h) \varepsilon^\alpha}{2h} = 1 - r \varepsilon^\alpha.$$

This completes the proof of the lemma. □

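Proof of Lemma 2.6. For $h_1 < h_2$, we denote the set $\{x \in U : h_1 < h(x) < h_2\}$ by $U(h_1, h_2)$. Let us take numbers $r_1, r_2 \in (0, H(M) - H(A))$, which do not depend on $\varepsilon$, such that $r_1 < r_2$. Take numbers $\alpha' < \alpha''$ such that $\alpha < \alpha' < \alpha'' < 1/2$. Let $\gamma' = \gamma(\varepsilon^{\alpha'})$ and $\gamma'' = \gamma(\varepsilon^{\alpha''})$. Let $\bar\sigma$ be the first time when the process $X^\varepsilon_t$ reaches $\gamma(r_1)$ or $\gamma'$, whichever happens first. Similarly, let $\bar{\bar\sigma}$ be the first time when the process $X^\varepsilon_t$ reaches $\gamma(r_2)$ or $\gamma''$, whichever happens first. For $x \in \bar\gamma$, using the Markov property of the process, we can write

(16) $$E_x \Big[ f(h(X^\varepsilon_\sigma)) - f(h(X^\varepsilon_0)) - \int_0^\sigma \mathcal{L}f(h(X^\varepsilon_s))\,ds \Big] = E_x \Big[ f(h(X^\varepsilon_{\bar\sigma})) - f(h(X^\varepsilon_0)) - \int_0^{\bar\sigma} \mathcal{L}f(h(X^\varepsilon_s))\,ds \Big]$$
$$+ E_x \Big( \chi_{\{h(X^\varepsilon_{\bar\sigma}) = r_1\}}\, E_{X^\varepsilon_{\bar\sigma}} \Big[ f(h(X^\varepsilon_\sigma)) - f(h(X^\varepsilon_0)) - \int_0^\sigma \mathcal{L}f(h(X^\varepsilon_s))\,ds \Big] \Big)$$
$$+ E_x \Big( \chi_{\{h(X^\varepsilon_{\bar\sigma}) = \varepsilon^{\alpha'}\}}\, E_{X^\varepsilon_{\bar\sigma}} \Big[ f(h(X^\varepsilon_\sigma)) - f(h(X^\varepsilon_0)) - \int_0^\sigma \mathcal{L}f(h(X^\varepsilon_s))\,ds \Big] \Big).$$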



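The probability $\sup_{x \in \bar\gamma} P_x(h(X^\varepsilon_{\bar\sigma}) = r_1)$ is estimated from above by $c(r_1) \varepsilon^\alpha$ for some constant $c(r_1)$ and all sufficiently small $\varepsilon$ (as follows from formula (35) of [10]). By the first part of Lemma 3.1,

$$\lim_{\varepsilon \to 0} \sup_{x \in \gamma(r_1)} \Big| E_x \Big[ f(h(X^\varepsilon_\sigma)) - f(h(X^\varepsilon_0)) - \int_0^\sigma \mathcal{L}f(h(X^\varepsilon_s))\,ds \Big] \Big| = 0.$$

Therefore, the second term in the right-hand side of (16) is of order $o(\varepsilon^\alpha)$. We also note that

$$\sup_{x \in \gamma'} \Big| E_x \Big[ f(h(X^\varepsilon_\sigma)) - f(h(X^\varepsilon_0)) - \int_0^\sigma \mathcal{L}f(h(X^\varepsilon_s))\,ds \Big] \Big| \le |f(\varepsilon^{\alpha'}) - f(0)| + \sup |\mathcal{L}f| \sup_{x \in \gamma'} E_x \sigma.$$

From Lemma 3.2 it follows that $\sup_{x \in \gamma'} E_x \sigma = o(\varepsilon^\alpha)$ as $\varepsilon \to 0$.

Therefore, the third term on the right-hand side of (16) is also of order $o(\varepsilon^\alpha)$. In order to prove (10), it remains to show that

(17) $$\sup_{x \in \bar\gamma} \Big| E_x \Big[ f(h(X^\varepsilon_{\bar\sigma})) - f(h(X^\varepsilon_0)) - \int_0^{\bar\sigma} \mathcal{L}f(h(X^\varepsilon_s))\,ds \Big] \Big| = o(\varepsilon^\alpha).$$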



Let $x_t$ be the deterministic process

$$dx_t = \frac{1}{\varepsilon} v(x_t)\,dt$$

and let $T = T(x)$ be the time it takes the process $x_t$, starting at $x$, to make one rotation along the level set, $T(x) = \inf\{t > 0 : x_t = x\}$.

We shall need several facts about the behavior of the processes $x_t$ and $X^\varepsilon_t$ with an initial point in $U(\varepsilon^{\alpha'}, r_1)$. Let us formulate them here as a separate lemma and then continue with the proof of Lemma 2.6.



Lemma 3.3. (a) There are positive constants $c_1$ and $c_2$ such that $c_1 \varepsilon \le T(x) \le c_2 \varepsilon |\ln \varepsilon|$ for all sufficiently small $\varepsilon$ and all $x \in U(\varepsilon^{\alpha'}, r_1)$.

(b) For any $\delta' > 0$, $R > 0$ and all sufficiently small $\varepsilon$, we have

$$P_x \Big( \sup_{t \le T(x)} |h(X^\varepsilon_t) - h(x_t)| > \varepsilon^{1/2 - \delta'} \Big) < \varepsilon^R \quad \text{for all } x \in U(\varepsilon^{\alpha'}, r_1).$$

(c) For any $\delta' > 0$, $R > 0$ and all sufficiently small $\varepsilon$, we have

$$P_x \Big( \sup_{t \le T(x)} |X^\varepsilon_t - x_t| > \varepsilon^{1/2 - \alpha' - \delta'} \Big) < \varepsilon^R \quad \text{for all } x \in U(\varepsilon^{\alpha'}, r_1).$$

The first statement immediately follows from the fact that the critical points of $H$ are nondegenerate. Parts (b) and (c) are proved in the same way as the corresponding statements in Lemma 5.3 below [see formulas (58) and (57)]. Let us proceed with the proof of Lemma 2.6.

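We inductively define a sequence of stopping times $T_n$ as follows:

$$T_0 = 0, \qquad T_1 = \min(\bar{\bar\sigma}, T(x)), \qquad T_{n+1} = T_n + T_1(X^\varepsilon_{T_n}).$$

Let us replace the time interval $[0, \bar\sigma]$ by a larger one, $[0, \hat\sigma]$, where $\hat\sigma$ is the first of the stopping times $T_n$ which is greater than or equal to $\bar\sigma$, that is,

$$\hat\sigma = \min_{n : T_n \ge \bar\sigma} T_n.$$

We would like to replace $\bar\sigma$ by $\hat\sigma$ in (17). Note that $\hat\sigma - \bar\sigma \le c_2 \varepsilon |\ln \varepsilon|$ by part (a) of Lemma 3.3. Parts (a) and (b) of Lemma 3.3 easily imply a statement which is slightly stronger than part (b) of the lemma. Namely, for any $\delta' > 0$, $R > 0$ and all sufficiently small $\varepsilon$, we have

$$P_x \Big( \sup_{t \le c_2 \varepsilon |\ln \varepsilon|} |h(X^\varepsilon_t) - h(x_t)| > \varepsilon^{1/2 - \delta'} \Big) < \varepsilon^R \quad \text{for all } x \in U(\varepsilon^{\alpha'}, r_1).$$

Therefore,

$$\sup_{x \in \bar\gamma} \big| E_x [ f(h(X^\varepsilon_{\hat\sigma})) - f(h(X^\varepsilon_{\bar\sigma})) ] \big| \le \sup_{x \in \bar\gamma} E_x E_{X^\varepsilon_{\bar\sigma}} \sup_{0 \le t \le c_2 \varepsilon |\ln \varepsilon|} \big| f(h(X^\varepsilon_t)) - f(h(X^\varepsilon_0)) \big| = o(\varepsilon^\alpha).$$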



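Also, by part (a) of Lemma 3.3,

$$\sup_{x \in \bar\gamma} \Big| E_x \Big[ \int_0^{\hat\sigma} \mathcal{L}f(h(X^\varepsilon_s))\,ds - \int_0^{\bar\sigma} \mathcal{L}f(h(X^\varepsilon_s))\,ds \Big] \Big| \le \sup |\mathcal{L}f| \cdot \sup_{x \in \bar\gamma} E_x(\hat\sigma - \bar\sigma) = o(\varepsilon^\alpha).$$

Therefore, (17) will follow if we show that

(18) $$\sup_{x \in \bar\gamma} \Big| E_x \Big[ f(h(X^\varepsilon_{\hat\sigma})) - f(h(X^\varepsilon_0)) - \int_0^{\hat\sigma} \mathcal{L}f(h(X^\varepsilon_s))\,ds \Big] \Big| = o(\varepsilon^\alpha).$$

Let

(19) $$R(\varepsilon) = \sup_{x \in U(\varepsilon^{\alpha'}, r_1)} \frac{\big| E_x \big[ f(h(X^\varepsilon_{T_1})) - f(h(X^\varepsilon_0)) - \int_0^{T_1} \mathcal{L}f(h(X^\varepsilon_s))\,ds \big] \big|}{E_x T_1}.$$

We shall prove that $\lim_{\varepsilon \to 0} R(\varepsilon) = 0$. Then, due to the Markov property of the process,

$$\sup_{x \in \bar\gamma} \Big| E_x \Big[ f(h(X^\varepsilon_{\hat\sigma})) - f(h(X^\varepsilon_0)) - \int_0^{\hat\sigma} \mathcal{L}f(h(X^\varepsilon_s))\,ds \Big] \Big|$$
$$\le \sup_{x \in \bar\gamma} \sum_{n=0}^\infty \Big| E_x \Big( \chi_{\{T_n < \bar\sigma\}}\, E_{X^\varepsilon_{T_n}} \Big[ f(h(X^\varepsilon_{T_1})) - f(h(X^\varepsilon_0)) - \int_0^{T_1} \mathcal{L}f(h(X^\varepsilon_s))\,ds \Big] \Big) \Big|$$
$$\le R(\varepsilon) \sup_{x \in \bar\gamma} \sum_{n=0}^\infty E_x \big( \chi_{\{T_n < \bar\sigma\}}\, E_{X^\varepsilon_{T_n}} T_1 \big) = R(\varepsilon) \sup_{x \in \bar\gamma} E_x \hat\sigma.$$

By Lemma 3.2, we have that $\sup_{x \in \bar\gamma} E_x \hat\sigma = O(\varepsilon^\alpha)$, and (18) will follow if we show that $\lim_{\varepsilon \to 0} R(\varepsilon) = 0$. From Lemma 3.3, part (b), it follows that

(20) $$\lim_{\varepsilon \to 0} \sup_{x \in U(\varepsilon^{\alpha'}, r_1)} \frac{T(x)}{E_x T_1} = 1.$$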



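Let us study the expression in the numerator of the right-hand side of (19). By Ito's formula,

(21) $$E_x \Big[ f(h(X^\varepsilon_{T_1})) - f(h(X^\varepsilon_0)) - \int_0^{T_1} \mathcal{L}f(h(X^\varepsilon_s))\,ds \Big]$$
$$= \frac{1}{2} E_x \int_0^{T_1} \big( |\nabla h(X^\varepsilon_s)|^2 f''(h(X^\varepsilon_s)) + \Delta h(X^\varepsilon_s) f'(h(X^\varepsilon_s)) \big)\,ds - E_x \int_0^{T_1} \big( a(h(X^\varepsilon_s)) f''(h(X^\varepsilon_s)) + b(h(X^\varepsilon_s)) f'(h(X^\varepsilon_s)) \big)\,ds.$$

From the definitions of the coefficients $a(h)$ and $b(h)$, it follows that

$$a(h(x)) = \frac{\int_0^{T(x)} |\nabla h(x_s)|^2\,ds}{2T(x)} \quad \text{and} \quad b(h(x)) = \frac{\int_0^{T(x)} \Delta h(x_s)\,ds}{2T(x)}.$$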



We shall need the following estimates on the behavior of the coefficients $a$ and $b$ and the derivatives of the function $f$ near zero.

Lemma 3.4. The asymptotic behavior of $a$, $b$ and $f''$ is as follows.

(a) There are positive constants $c_1$ and $c_2$ such that

$$c_1 |\ln h|^{-1} \le a(h) \le c_2 |\ln h|^{-1}$$

for all sufficiently small $h$. Moreover, $a'(h) = o(1/h)$ as $h \to 0$.

(b) There are positive constants $c_3$ and $c_4$ such that

$$c_3 \le b(h) \le c_4$$

for all sufficiently small $h$. Moreover, $b'(h) = o(1/h)$ as $h \to 0$.

(c) There is a positive constant $c_5$ such that

$$|f''(h)| \le c_5 |\ln h|$$

for all sufficiently small h. Moreover, \f"'(h)\ = o( > ) as h— >0. 

The estimates of $a(h)$ and $b(h)$ and the asymptotics for their derivatives follow from the proof of Lemma 4.5 of [10]. The estimates of $f''(h)$ and $f'''(h)$ are due to the estimates of $a$, $b$ and their derivatives, and to the fact that $a(h) f''(h) + b(h) f'(h) \in \Psi$. Let us proceed with the proof of Lemma 2.6.

We would like to add and then subtract the expression $\mathcal{L}f(h(x)) E_x T_1$ from the right-hand side of (21). First, however, we transform it as follows:

$$\mathcal{L}f(h(x)) E_x T_1 = E_x \int_0^{T_1} \big( a(h(x_s)) f''(h(x_s)) + b(h(x_s)) f'(h(x_s)) \big)\,ds$$
$$= E_x \int_0^{T(x)} \big( a(h(x_s)) f''(h(x_s)) + b(h(x_s)) f'(h(x_s)) \big)\,ds + T(x) \psi_1(\varepsilon, x)$$
$$= \frac{1}{2} E_x \int_0^{T(x)} \big( |\nabla h(x_s)|^2 f''(h(x_s)) + \Delta h(x_s) f'(h(x_s)) \big)\,ds + T(x) \psi_1(\varepsilon, x)$$
$$= \frac{1}{2} E_x \int_0^{T_1} \big( |\nabla h(x_s)|^2 f''(h(x_s)) + \Delta h(x_s) f'(h(x_s)) \big)\,ds + T(x) \psi_2(\varepsilon, x).$$

Here, $\psi_1$ and $\psi_2$ are such that

$$\lim_{\varepsilon \to 0} \sup_{x \in U(\varepsilon^{\alpha'}, r_1)} |\psi_i(\varepsilon, x)| = 0, \qquad i = 1, 2.$$

The second and the fourth equalities above are due to Lemma 3.3, part (b) [which implies that $P_x(T \ne T_1) < \varepsilon^R$], Lemma 3.3, part (a) [which bounds $T(x)$ from below] and Lemma 3.4 [which bounds the integrand from above].
Thus, we can rewrite the right-hand side of (21) as

$$\frac{1}{2} E_x \int_0^{T_1} \big( |\nabla h(X^\varepsilon_s)|^2 f''(h(X^\varepsilon_s)) - |\nabla h(x_s)|^2 f''(h(x_s)) \big)\,ds$$
$$+ \frac{1}{2} E_x \int_0^{T_1} \big( \Delta h(X^\varepsilon_s) f'(h(X^\varepsilon_s)) - \Delta h(x_s) f'(h(x_s)) \big)\,ds$$
$$+ E_x \int_0^{T_1} \big( a(h(x_s)) f''(h(x_s)) - a(h(X^\varepsilon_s)) f''(h(X^\varepsilon_s)) \big)\,ds$$
$$+ E_x \int_0^{T_1} \big( b(h(x_s)) f'(h(x_s)) - b(h(X^\varepsilon_s)) f'(h(X^\varepsilon_s)) \big)\,ds - T(x) \psi_2(\varepsilon, x).$$

Due to Lemmas 3.3 and 3.4, the absolute value of the expectation of each of the four integrals above is estimated by an expression of the form $T(x) \psi(\varepsilon, x)$ such that

$$\lim_{\varepsilon \to 0} \sup_{x \in U(\varepsilon^{\alpha'}, r_1)} |\psi(\varepsilon, x)| = 0.$$

This, together with (20), implies that $\lim_{\varepsilon \to 0} R(\varepsilon) = 0$. This completes the proof of (10).

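Let us now prove (11). Let us denote the one-sided derivative of $f(h)$ at $h = 0$ by $f'(0)$. Then,

$$f(h) = f(0) + f'(0) h + o(h) \quad \text{as } h \to 0$$

and

$$\mathcal{L}f(h) = \frac{1}{k} f'(0) + o(1) \quad \text{as } h \to 0,$$

where $k$ is the same as in the definition of the operator $\mathcal{L}$. Therefore, we can estimate the left-hand side of (11) as follows:

$$E_\nu \Big[ f(h(X^\varepsilon_\tau)) - f(h(X^\varepsilon_0)) - \int_0^\tau \mathcal{L}f(h(X^\varepsilon_s))\,ds \Big] = f'(0) \varepsilon^\alpha + o(\varepsilon^\alpha) - \frac{1}{k} f'(0) E_\nu \tau + o(1) E_\nu \tau = o(\varepsilon^\alpha) \quad \text{as } \varepsilon \to 0.$$

Here, we used the facts that $0 \le h(X^\varepsilon_s) \le \varepsilon^\alpha$ for $0 \le s \le \tau$, that $E_\nu \tau = k_2 \varepsilon^\alpha (1 + o(1))$ as $\varepsilon \to 0$, where $k_2$ is the same as in Lemma 2.4, and that $k_2 = k$. It remains to prove the last statement of Lemma 2.6.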
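From Lemma 3.1, part (a), it follows that

$$\sup_{x \in U \setminus V^\varepsilon} \Big| E_x \Big[ f(h(X^\varepsilon_\tau)) - f(h(X^\varepsilon_0)) - \int_0^\tau \mathcal{L}f(h(X^\varepsilon_s))\,ds \Big] \Big| \to 0 \quad \text{as } \varepsilon \to 0.$$

Using Lemma 2.2 and arguments similar to those used in the proof of Lemma 2.4, it is easily seen that

$$\sup_{x \in \mathcal{E} \cup V^\varepsilon} E_x \tau \to 0 \quad \text{as } \varepsilon \to 0.$$

Therefore,

$$\sup_{x \in \mathcal{E} \cup V^\varepsilon} \Big| E_x \Big[ f(h(X^\varepsilon_\tau)) - f(h(X^\varepsilon_0)) - \int_0^\tau \mathcal{L}f(h(X^\varepsilon_s))\,ds \Big] \Big| \to 0 \quad \text{as } \varepsilon \to 0.$$

This completes the proof of Lemma 2.6. □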
4. Estimate on the time of exit from the ergodic component. Instead of $X^\varepsilon_t$, it will be convenient to consider the same process slowed down by a factor of $\varepsilon$. Thus, we define the following process:

(22) $d\tilde X^\varepsilon_t = v(\tilde X^\varepsilon_t)\,dt + \sqrt{\varepsilon}\,dW_t, \qquad \tilde X^\varepsilon_t \in \mathbb{T}^2.$

The first time when the process $\tilde X^\varepsilon_t$ reaches the set $\gamma(\varepsilon^{1/2})$ will be denoted by $\bar\tau$. In the new notation, Lemma 2.2 can be reformulated as follows.

Lemma 4.1. For any $\varkappa > 0$, there exists some $\varepsilon_0 > 0$ such that $E_x \bar\tau \le \varepsilon^{-1/2 - \varkappa}$ for $\varepsilon < \varepsilon_0$, for all $x \in \mathrm{Cl}(\mathcal{E})$.

The proof of Lemma 4.1 will rely on a number of technical lemmas. We 
shall state all of these below and will provide their proofs in the next section. 
Let us first, however, give an intuitive explanation of the result. If the distance from the original point to the periodic component is strictly positive, then it will take time of order $\varepsilon^{-1}$ for the process $\sqrt{\varepsilon}\,W_t$ to reach the set $\gamma(\varepsilon^{1/2})$. What helps us to get an extra factor $\varepsilon^{1/2-\varkappa}$ is the presence of the ergodic flow $v$ in the right-hand side of (22). To explain our strategy, take a curve $\Gamma$ transversal to the unperturbed flow $y_t' = v(y_t)$. Then, (22) is a small perturbation of $v$, so near each crossing of $\Gamma$ by the deterministic orbit, there is a crossing by the orbit of (22) (in fact, there are infinitely many random crossings near each deterministic one, but we consider the first of them). Let $B$ be a point on $\Gamma$ which is carried into the saddle point by the unperturbed flow. We want to see how long it takes for the random orbit to get inside the $\sqrt{\varepsilon}$-neighborhood of $B$, since then, the noise will help to get inside the periodic component. Let $Y_n$ denote the consecutive crossings of $\Gamma$ by the deterministic orbit. Then, the average distance between the points $\{Y_n\}_{n=1}^N$ is $1/N$, so we expect to find one of the points $Y_n$ within $O(1/N)$ from $B$. Next, the distance between the deterministic and random orbits is asymptotically Gaussian with standard deviation of order $\sqrt{N\varepsilon}$. The above two quantities are of the same order if $1/N = \sqrt{N\varepsilon}$, that is, $N = \varepsilon^{-1/3}$. The probability that a Gaussian random variable with standard deviation $\varepsilon^{1/3}$ hits a segment of length $\sqrt{\varepsilon}$ is of order $\sqrt{\varepsilon}/\varepsilon^{1/3} = \varepsilon^{1/6}$. Hence, we expect that the probability of not reaching the $\varepsilon^{1/2}$-neighborhood of $B$ after $\varepsilon^{-1/3}$ crossings does not exceed $1 - c\varepsilon^{1/6}$, so for $n \gg \varepsilon^{-1/3}$, the probability that the random orbit stays inside the ergodic component for $n$ crossings does not exceed $(1 - c\varepsilon^{1/6})^{n\varepsilon^{1/3}}$. This quantity tends to zero if we consider $n \gg \varepsilon^{-1/2}$, and Lemma 4.1 follows from the Markov property of the process.
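
The arithmetic behind this heuristic can be checked numerically. The following is a minimal sketch, not part of the original argument: the function names are hypothetical, and the constant `c` stands in for the unspecified constant $c$ of the text (set to 1 purely for illustration).

```python
import math


def heuristic_scales(eps):
    """Return (N, gap, spread, p_hit) for the heuristic balance at noise level eps."""
    # Number of crossings at which the mean gap 1/N matches the spread sqrt(N*eps).
    n_cross = eps ** (-1.0 / 3.0)
    gap = 1.0 / n_cross
    spread = math.sqrt(n_cross * eps)
    # Chance that a Gaussian of that spread lands in a window of length sqrt(eps).
    p_hit = math.sqrt(eps) / spread
    return n_cross, gap, spread, p_hit


def survival_bound(eps, n, c=1.0):
    """Heuristic bound (1 - c*eps^(1/6))^(n*eps^(1/3)) on staying in the ergodic component.

    c is a hypothetical stand-in for the unspecified constant in the text.
    """
    return (1.0 - c * eps ** (1.0 / 6.0)) ** (n * eps ** (1.0 / 3.0))


if __name__ == "__main__":
    eps = 1e-6
    print(heuristic_scales(eps))
    for exponent in (0.4, 0.5, 0.6):
        print(exponent, survival_bound(eps, eps ** (-exponent)))
```

The bound only starts to decay once the number of crossings $n$ is well beyond $\varepsilon^{-1/2}$, which is the scale appearing in Lemma 4.1.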

In the rest of this section, we provide the precise version of the heuristic arguments given above. For various reasons (one of the reasons being that we want to cover a set of rotation numbers of full measure), the actual statements will be slightly worse than the above heuristics, so the result we get will differ from $\mathbb{E}(\tau) \sim \varepsilon^{-1/2}$, which could be expected if our heuristic arguments were literally correct, by a factor $\varepsilon^{-\varkappa}$, where $\varkappa$ can be taken arbitrarily small.

Let us now proceed with the rigorous arguments. The main result of [1] implies that there is a smooth closed curve $\Gamma$ on the torus which satisfies the following properties.

(a) $\Gamma$ is homeomorphic to a circle and lies in the interior of $\mathcal{E}$.

(b) $\Gamma$ is transversal to the vector field $v$ at every point of $\Gamma$.

(c) Let $y_t$ be the solution of the equation $y_t' = v(y_t)$ with the initial data $y_0 = x \in \mathcal{E}$. Thus, $y_t$ follows the orbit of the unperturbed flow. We denote by $\sigma$ the first positive time when $y_t$ reaches the set $\Gamma$, that is,

\[ \sigma = \sigma(x) = \inf\{t > 0 : y_t \in \Gamma\}. \]

There is a unique point $B \in \Gamma$ such that $\sigma(B) = \infty$. For the solution of the equation $y_t' = v(y_t)$ with the initial data $y_0 = B$, we have $\lim_{t\to\infty} y_t = A$, where $A$ is the saddle point on the torus. In other words, the flow line which starts at $B$ enters the saddle point before reaching $\Gamma$. For the rest of the points $x \in \Gamma$, the time $\sigma(x)$ is finite.

(d) We can define a return map $f : \Gamma \setminus \{B\} \to \Gamma$ as follows. For $x \in \Gamma \setminus \{B\}$, we take the solution $y_t$ of the equation $y_t' = v(y_t)$ with the initial data $y_0 = x$; $f(x)$ is then defined to be equal to $y_{\sigma(x)}$.

We can choose coordinates $\theta : \Gamma \to S^1 = [0,1)$ such that $\theta(B) = 0$ and the return map is conjugate to the rotation by angle $\rho$, that is, $\theta(f(x)) = \theta(x) + \rho \pmod 1$. In fact, $d\theta = k\,dH$ on $\Gamma \cap D$, where $k$ is a positive constant
(H itself is multivalued on the torus, but dH = —H' X2 dx\ + H' dxi is well 
defined) . For the sake of simplicity of notation and without loss of generality, 
we shall assume that k = 1. 

Without loss of generality, we may assume that $\Gamma$ is perpendicular to the vector field $v$ in some neighborhood of $B$ (this will be convenient in the proof of Lemma 4.2).

Let $d(x,y)$ be the metric on $\Gamma$ equal to the distance on the circle between $\theta(x)$ and $\theta(y)$. By considering the image of $\Gamma$ under the action of the flow $-v(x)$ for a sufficiently small time, we can obtain another curve $\Gamma'$ which satisfies the same properties as $\Gamma$ and does not intersect it (see Figure 3). We introduce a sequence of stopping times $\tau_n$, where $\tau_0 = 0$ and $\tau_{n+1}$ is the first time, following $\tau_n$, when the process $\widetilde{X}_t^\varepsilon$ reaches $\Gamma$ after first visiting $\Gamma'$, that is,

\[ \tau_{n+1} = \inf_{t > \tau_n}\{t : \widetilde{X}_t^\varepsilon \in \Gamma,\ \widetilde{X}_s^\varepsilon \in \Gamma' \text{ for some } \tau_n < s < t\}. \]

Let $\bar\tau_n = \min(\tau, \tau_n)$. We can define a Markov chain with the state space $\Gamma$ as follows. Let $X_0 = x$ and $X_n = \widetilde{X}^\varepsilon_{\bar\tau_n}$, where $\widetilde{X}_t^\varepsilon$ is the solution of equation (22) starting at $\widetilde{X}_0^\varepsilon = x$. Then, $X_1, X_2, \ldots$ is a Markov chain on $\Gamma$ (or, if $x \in \Gamma$, we can start the Markov chain with $X_0$). We would like to study the successive returns of this chain to the interval $I_\varepsilon = \{x \in \Gamma : d(x,B) \le \sqrt{\varepsilon}\}$. We are interested in this because we have the following lemma (see Section A.1).

Fig. 3. Flow on one period.

Lemma 4.2. There exist $c > 0$ and $\varepsilon_0 > 0$ such that $\mathbb{P}_x(\tau < \tau_1) \ge c$ for $\varepsilon \le \varepsilon_0$, for $x \in I_\varepsilon$.

It is easily seen that instead of considering all initial points $x \in \mathcal{E}$ in Lemma 4.1, it is enough to consider $x \in \Gamma$. Indeed, we have the following lemma (see Section A.1).

Lemma 4.3. There exist $c > 0$ and $\varepsilon_0 > 0$ such that $\mathbb{E}_x \bar\tau_1 \le c|\ln\varepsilon|$ for $\varepsilon \le \varepsilon_0$, for all $x \in \mathrm{Cl}(\mathcal{E} \cup U_{\varepsilon^{1/2}})$, where $U_{\varepsilon^{1/2}}$ is the part of the periodic component bounded by $\gamma$ and $\gamma(\varepsilon^{1/2})$.

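Let $\eta$ be the first of the stopping times $\tau_n$ when $X_n \in I_\varepsilon$, that is, $\eta = \min_{n : X_n \in I_\varepsilon}\{\tau_n\}$. We also define $\bar\eta = \min(\tau, \eta)$. We shall prove the following lemma.

Lemma 4.4. For any $\varkappa > 0$, there exists some $\varepsilon_0 > 0$ such that $\mathbb{E}_x \bar\eta \le \varepsilon^{-1/2-\varkappa}$ for $\varepsilon \le \varepsilon_0$, for all $x \in \Gamma$.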

Lemmas 4.2-4.4, together with the Markov property of the process $\widetilde{X}_t^\varepsilon$, easily imply Lemma 4.1. Indeed, let $A_n$ be the event that the orbit reaches $\gamma(\varepsilon^{1/2})$ for the first time after exactly $n$ returns to $I_\varepsilon$ and let $\chi_{A_n}$ be the indicator function of this event. Then, $\sup_{x \in I_\varepsilon}(1 - \mathbb{P}_x(A_0)) \le 1 - c_1$ and $\sup_{x \in I_\varepsilon} \mathbb{P}_x(A_n) \le (1 - c_1)^n$, where $c_1$ is the constant from Lemma 4.2. From the relation

ti sup P x /( J 4 n _i) + sup E x i(rxA n -i) 



sup E x (txa„ ) < sup ( ( 1 - xa ) 



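it follows by induction on $n$ that $\sup_{x \in I_\varepsilon} \mathbb{E}_x(\tau\chi_{A_n}) \le (n+1)\varepsilon^{-1/2-\varkappa}(1 - c_1)^{n-1}$. Therefore,

\[
\begin{aligned}
\sup_{x \in \mathrm{Cl}(\mathcal{E})} \mathbb{E}_x \tau
&\le \sup_{x \in \mathrm{Cl}(\mathcal{E})} \mathbb{E}_x \bar\tau_1 + \sup_{x \in \Gamma} \mathbb{E}_x \bar\eta + \sum_{n=0}^{\infty} \sup_{x \in I_\varepsilon} \mathbb{E}_x(\tau\chi_{A_n}) \\
&\le c_2|\ln\varepsilon| + \varepsilon^{-1/2-\varkappa} + \sum_{n=0}^{\infty} (n+1)\varepsilon^{-1/2-\varkappa}(1 - c_1)^{n-1} \le c_3\varepsilon^{-1/2-\varkappa},
\end{aligned}
\]

where $c_2$ is the constant from Lemma 4.3 and $c_3$ is some other constant. This proves Lemma 4.1.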
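
The series in the last estimate is a standard arithmetico-geometric sum, so it contributes only a constant depending on $c_1$. A minimal sketch of that check follows; the function names are hypothetical and the value of `c1` is arbitrary, since the text does not specify the constant from Lemma 4.2.

```python
def series_partial_sum(c1, terms=10_000):
    """Partial sum of sum_{n>=0} (n+1) * (1-c1)^(n-1)."""
    r = 1.0 - c1
    return sum((n + 1) * r ** (n - 1) for n in range(terms))


def series_closed_form(c1):
    """Closed form 1 / ((1-c1) * c1^2), valid for 0 < c1 < 1."""
    r = 1.0 - c1
    return 1.0 / (r * c1 ** 2)


if __name__ == "__main__":
    for c1 in (0.1, 0.5, 0.9):
        print(c1, series_partial_sum(c1), series_closed_form(c1))
```

The closed form uses $\sum_{n\ge 0}(n+1)r^n = (1-r)^{-2}$ with $r = 1 - c_1$, divided by one extra factor of $r$.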

Let us reduce Lemma 4.4 to a number of simpler statements. Let us consider the deterministic process $Y_n$ on the state space $\Gamma$, where $Y_0 = x$ is the initial point and $Y_{n+1} = f(Y_n)$. Thus, in $\theta$-coordinates, it is simply a rotation by angle $\rho$. Note that $f(Y_n)$ is not defined if $Y_n = B$, but we can define $Y_{n+1}$ in this case using the $\theta$-coordinates. Define

\[ N(x) = \min_{n \ge 0}\{n : \sqrt{n\varepsilon} \ge d(Y_n, B)\}, \]

where $Y_0 = x$.

We shall need the following lemma about the process $Y_n$ (see Section A.3); basically, this is a statement about circle rotations by the angle $\rho$.

Lemma 4.5. For any $\delta > 0$, there exists some $\varepsilon_0 > 0$ such that $N(x) \le \varepsilon^{-1/3-\delta}$ for all $\varepsilon \le \varepsilon_0$, for all $x \in \Gamma$.
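
Since $Y_n$ is a circle rotation in $\theta$-coordinates, $N(x)$ can be computed directly. The sketch below is only an illustration: the golden-mean rotation number `RHO` and the function names are assumptions made for the example, not quantities taken from the paper.

```python
import math

# Illustrative rotation number (golden mean); an assumption, not from the text.
RHO = (math.sqrt(5.0) - 1.0) / 2.0


def circle_dist(theta):
    """Distance from theta to 0 on the circle R/Z, i.e. d(., B) with theta(B) = 0."""
    frac = theta % 1.0
    return min(frac, 1.0 - frac)


def first_close_time(theta0, eps, rho=RHO):
    """Return N = min{n >= 0 : sqrt(n*eps) >= d(Y_n, B)} for Y_n = theta0 + n*rho."""
    n = 0
    # The loop terminates because d <= 1/2 while sqrt(n*eps) grows without bound.
    while math.sqrt(n * eps) < circle_dist(theta0 + n * rho):
        n += 1
    return n


if __name__ == "__main__":
    eps = 1e-4
    worst = max(first_close_time(k / 50.0, eps) for k in range(50))
    print("largest N over sampled starting points:", worst)
    print("eps^(-1/3):", eps ** (-1.0 / 3.0))
```

Comparing the largest sampled value of $N$ with $\varepsilon^{-1/3}$ for a few values of $\varepsilon$ gives a quick feel for the scaling asserted in Lemma 4.5.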

Let us now show that Lemma 4.4 is implied by the following result proved in Section 5.

Lemma 4.6. For any $\delta > 0$, there exists some $\varepsilon_0 > 0$ such that

\[ \mathbb{P}_x(X_{N(x)} \in I_\varepsilon) \ge \varepsilon^{1/6+\delta} \]

for all $\varepsilon \le \varepsilon_0$, for all $x \in \Gamma$.
Take an arbitrary $\delta > 0$. From Lemmas 4.5 and 4.6, it follows that
F x (X n i I e for all n < e^ 28 ) < (1 - ^f-^~ s \. 
Therefore, for all sufficiently small e and all x £ T, we have 

(23) F x (X n $ I e for all n < e~ l/2 ' 28 ) < \. 

From Lemma 4.3 and the Markov property of the process, it follows that for some $c_1, c_2 > 0$ and all sufficiently small $\varepsilon$, we have

\[ \mathbb{P}_x(\bar\tau_1 > r|\ln\varepsilon|) \le c_1 e^{-c_2 r}, \qquad r \ge 1, \]

for all $x \in \Gamma$. Therefore,

(24) IP,(r [£ - 1 /^]>e" 1/2 " 35 )<i 

for all sufficiently small e for all x £ T. Combining formulas (23) and (24), 
we obtain 

iMr^- 1 / 2 - 35 )^! 

for all sufficiently small $\varepsilon$, for all $x \in \Gamma$. With the help of Lemma 4.3, it is easily seen that this, in fact, holds for all $x \in \mathrm{Cl}(\mathcal{E} \cup U_{\varepsilon^{1/2}})$. Using the Markov property of the process, we can now conclude that $\mathbb{E}_x \bar\eta \le k\varepsilon^{-1/2-3\delta}$ for some constant $k$, for all sufficiently small $\varepsilon$, for all $x \in \mathrm{Cl}(\mathcal{E} \cup U_{\varepsilon^{1/2}})$. Since $\delta$ was an arbitrary positive number, this implies Lemma 4.4.



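5. Probability of a close encounter. Here, we prove Lemma 4.6. The statement of this lemma was motivated by the heuristic consideration that $\theta(X_N) - \theta(Y_N)$ is approximately Gaussian with standard deviation of order $\sqrt{N\varepsilon}$. To give a rigorous proof, we need more precise asymptotics. Itô's formula tells us that

\[
\begin{aligned}
\theta(X_N) - \theta(Y_N)
&= \sqrt{\varepsilon}\int_0^{\tau_N} \nabla H(\widetilde{X}_s^\varepsilon)\,dW_s + \frac{\varepsilon}{2}\int_0^{\tau_N} \Delta H(\widetilde{X}_s^\varepsilon)\,ds \\
&= \sqrt{\varepsilon}\int_0^{\tau_N} \nabla H(y_s)\,dW_s
 + \sqrt{\varepsilon}\int_0^{\tau_N} [\nabla H(\widetilde{X}_s^\varepsilon) - \nabla H(y_s)]\,dW_s
 + \frac{\varepsilon}{2}\int_0^{\tau_N} \Delta H(\widetilde{X}_s^\varepsilon)\,ds.
\end{aligned}
\tag{25}
\]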
We will show that the main contribution comes from the first term, which is Gaussian. We expect that the last term is $O(\varepsilon N)$ and the second term is a martingale which grows like $\sqrt{V}$, where $V$ is the quadratic variation. We expect that $V$ grows as follows:
N N 

j=i i=i 
This suggests that $\theta(X_N) - \theta(Y_N)$ is approximately Gaussian with a correction which is slightly worse than

\[ O\bigl([(\varepsilon^{-1/3})^{3/2}\varepsilon^{3/2}]^{1/2}\bigr) = O(\sqrt{\varepsilon}). \]

This appears bad because we want our process to visit the interval $I_\varepsilon$ of length $O(\sqrt{\varepsilon})$ with positive probability. We overcome this difficulty as follows. We split $N = n_1 + n_2$, where $n_2 \ll \varepsilon^{-1/3}$. During the first $n_1$ iterations, we aim to hit an interval of size $\sqrt{n_2\varepsilon}$, which can be made larger than the correction to Gaussian behavior if $n_2 \gg 1$. During the last $n_2$ steps, we arrange to hit $I_\varepsilon$. Here, it is important that the contribution to the first integral (main Gaussian part) in the right-hand side of (25) from the last $n_2$ steps is larger than the correction accumulated during the first $n_1$ steps. The problem described above does not appear now since the correction accumulated during the last $n_2$ steps is

\[ O\bigl([n_2^{3/2}\varepsilon^{3/2}]^{1/2}\bigr) \ll \sqrt{\varepsilon}. \]

Below, we present the formal, if a bit tedious, implementation of the plan outlined here.

We will need two lemmas concerning the deterministic process $Y_n$. The proofs are given in Section A.3.

Lemma 5.1. For any $\delta > 0$, there exists some $q_0 > 0$ such that

\[ |p - q\rho| \ge q^{-1-\delta} \]

for all integers $q \ge q_0$ and $p$.
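
A Diophantine bound of this type can be probed numerically for a concrete rotation number. The following sketch is illustrative only: the golden mean is chosen as an assumed example of a Diophantine number, the function names are hypothetical, and a finite scan is numerical evidence, not a proof.

```python
import math

# Illustrative rotation number (golden mean); an assumption, not from the text.
RHO = (math.sqrt(5.0) - 1.0) / 2.0


def dist_to_integers(x):
    """min over integers p of |p - x|."""
    return abs(x - round(x))


def violates(q, delta, rho=RHO):
    """True if min_p |p - q*rho| < q^(-1-delta), i.e. the bound fails at q."""
    return dist_to_integers(q * rho) < q ** (-1.0 - delta)


def smallest_q0(delta, q_max, rho=RHO):
    """Smallest q0 such that no q in [q0, q_max] violates the bound."""
    bad = [q for q in range(1, q_max + 1) if violates(q, delta, rho)]
    return (max(bad) + 1) if bad else 1


if __name__ == "__main__":
    for delta in (0.5, 0.2, 0.1):
        print(delta, smallest_q0(delta, 100_000))
```

Smaller values of $\delta$ push the threshold $q_0$ upward, in line with the lemma requiring $q_0$ to depend on $\delta$.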

Lemma 5.2. (a) There is a constant $c$ such that for any $x \in \Gamma$ and any $N > 0$, we have

\[
\sum_{0 \le n \le N} \frac{1}{d(Y_n, B)} \le c\left[ N\ln^2 N + \frac{\bigl|\ln \min_{0 \le n \le N} d(Y_n, B)\bigr|}{\min_{0 \le n \le N} d(Y_n, B)} \right].
\tag{26}
\]

(b) For any 5 > 0, there is a constant c such that if 
(27) d(Y n ,B)>y/£L 
for all n< N , then for all sufficiently small e, we have 

N r- 
Let us take some $0 < \alpha < 1/8$. We write $\Gamma = \Gamma_1 \cup \Gamma_2$, where $\Gamma_1$ and $\Gamma_2$ are defined as follows. For $x \in \Gamma_1$, we have $N(x) \ge 2\varepsilon^{-\alpha}$ and for $x \in \Gamma_2$, we have $N(x) < 2\varepsilon^{-\alpha}$. In the first case, we write $N(x) = n_1 + n_2$ with $n_2 = [\varepsilon^{-\alpha}] \le N(x)/2$. In the second case, we let $n_1 = 0$, $n_2 = N(x)$.

Let $\|\theta\|$ denote the distance from zero of a point $\theta$ on the unit circle and note that there is a naturally defined operation of addition on the circle. We shall need the following lemma (proved in Section A.1). Here, it is convenient to consider the processes on the plane $\mathbb{R}^2$, where the stream function $H$ is single-valued. Let

\[
\mathcal{I} = \int_0^{\min(\sigma,\tau_1)} |\nabla H(\widetilde{X}_s^\varepsilon) - \nabla H(y_s)|^2\,ds + \int_{\min(\sigma,\tau_1)}^{\max(\sigma,\tau_1)} |\nabla H(\cdot)|^2\,ds,
\]

where the argument in the second integral is equal to $y_s$ if $\tau_1 < \sigma$ and to $\widetilde{X}_s^\varepsilon$ if $\tau_1 > \sigma$.

Let $x^a, x^b \in \Gamma$. Let $y_s^a$ and $y_s^b$ be the orbits of the unperturbed flow which start at $x^a$ and $x^b$, respectively, and let $\sigma^a$ and $\sigma^b$ be the corresponding first return times to $\Gamma$. Define

\[
\mathcal{J}(x^a, x^b) = \int_0^{\min(\sigma^a,\sigma^b)} |\nabla H(y_s^a) - \nabla H(y_s^b)|^2\,ds + \int_{\min(\sigma^a,\sigma^b)}^{\max(\sigma^a,\sigma^b)} |\nabla H(\cdot)|^2\,ds,
\]

where the argument in the second integral is equal to $y_s^b$ if $\sigma^a < \sigma^b$ and to $y_s^a$ otherwise.



Lemma 5.3. For any $R > 0$ and $\delta', \delta'' > 0$, there exists some $\varepsilon_0 > 0$ such that for $x \in \Gamma$ and all $\varepsilon \le \varepsilon_0$, we have

(28) pj sup \Xl-y s \ > £ ^ll-\ < e R l fd(x 1 B)>e 1 / 2 ~ s ', 

\s<max(<r,Ti) a\X,D)J 

(29) P x .(n > c(<5')|lne|) <e R for some c(5') if d(x,B) > e 1/2 - & ' , 

(30) P*(|#(^n) " H i*)\ > £l ' 2 ~ 5 ") < 

(31) E X I< £ -^^ t fd(x,B)>e 1 / 2 ~ s '. 



There is a constant k such that for x a ,x b G T, we have 

kd(x a ,x 
d{x a ,B 



(32) 3(x a ,x b ) < ^"'p? ifd{x a ,B)>Oandd(x a ,x b )<y(x a ,B). 



We split the proof of Lemma 4.6 into two parts as follows.

Lemma 5.4. (a) There is a positive constant $k_1$ such that for all sufficiently small $\varepsilon$ and all $x \in \Gamma_1$, we have

\[ \mathbb{P}_x\bigl(\|\theta(X_{n_1}) + \rho n_2\| \le \sqrt{n_2\varepsilon}\bigr) \ge k_1\sqrt{\frac{n_2}{n_1}}, \]

where $n_1 = n_1(x)$ and $n_2 = n_2(x)$ were defined above.

(b) Let $x \in \Gamma$ and $n_2 \le 2\varepsilon^{-\alpha}$ be such that $\|\theta(x) + \rho n_2\| \le \sqrt{n_2\varepsilon}$. There is a positive constant $k_2$, which does not depend on $x$ and $n_2$, such that for all sufficiently small $\varepsilon$, we have

\[ \mathbb{P}_x(X_{n_2} \in I_\varepsilon) \ge \frac{k_2}{\sqrt{n_2}}. \]

Lemma 5.4 implies that for $x \in \Gamma_1$, we have

\[ \mathbb{P}_x(X_{N(x)} \in I_\varepsilon) \ge \mathbb{P}_x(A_a)\,\mathbb{P}_x(A_b \mid A_a) \ge \frac{k_1 k_2}{\sqrt{n_1}}, \]

where $A_a$ and $A_b$ are events from parts (a) and (b) of Lemma 5.4, respectively. Since $n_1 \le N(x)$, we get the statement of Lemma 4.6 for $x \in \Gamma_1$ from Lemma 4.5. For $x \in \Gamma_2$, we immediately get the statement of Lemma 4.6 from part (b) of Lemma 5.4.

Next, we prove Lemma 5.4. Consider first the case $x \in \Gamma_1$. Let $\sigma_0 = 0$ and let $\sigma_n$ be the time when the trajectory $y_t$ of the unperturbed flow returns to the set $\Gamma$ for the $n$th time (this occurs at the point $Y_n$). Let $s_n = \sigma_{n+1} - \sigma_n$. Let

\[ \xi_n = \sqrt{\varepsilon}\int_0^{\sigma_n} \nabla H(y_s)\,dW_s. \]

$\xi_n$ is then Gaussian with zero mean and variance $V_n = \varepsilon\int_0^{\sigma_n} |\nabla H(y_s)|^2\,ds$. Observe that since $\nabla H$ vanishes at $A$, we have

\[ c_1 \le \int_0^{\sigma} |\nabla H(y_s)|^2\,ds \le c_2 \tag{33} \]

and therefore $c_1\varepsilon n \le V_n \le c_2\varepsilon n$ for some positive constants $c_1$ and $c_2$. We now construct a coupling between $\theta(X_n) - \theta(Y_n)$ and $\xi_n$. Namely, we claim that there exists a random variable $\tilde\xi_{n_1}$ with the same distribution as $\xi_{n_1}$ such that for any $\delta, R > 0$, we have

\[ \mathbb{P}_x\bigl(\|\theta(X_{n_1}) - \theta(Y_{n_1}) - \tilde\xi_{n_1}\| > \varepsilon^{1/2-\delta}\bigr) = O(\varepsilon^R). \tag{34} \]

This inequality (with $\delta < \alpha$) implies part (a) of Lemma 5.4. Indeed,

\[ \|\theta(Y_{n_1}) + n_2\rho\| \le \sqrt{N(x)\varepsilon} \le \sqrt{2n_1\varepsilon}, \tag{35} \]

so the event of part (a) of Lemma 5.4 happens if $\tilde\xi_{n_1}$ belongs to an interval of length $2(\sqrt{n_2\varepsilon} - \varepsilon^{1/2-\delta})$ whose center is at most $\sqrt{2n_1\varepsilon}$ away from zero, and $\|\theta(X_{n_1}) - \theta(Y_{n_1}) - \tilde\xi_{n_1}\| \le \varepsilon^{1/2-\delta}$. Formula (34) and the fact that $\tilde\xi_{n_1}$ is Gaussian allow us to conclude that the favorable probability is at least $c\sqrt{n_2\varepsilon}/\sqrt{n_1\varepsilon} - O(\varepsilon^R)$ for some positive constant $c$, which implies part (a) of Lemma 5.4. Thus, we need to prove (34).

We now describe the construction of £„. Define W™ to be Wt for T n <t < 
T n+ \ and a Brownian motion (started at W Tn+1 ) independent of Wt and of 
Wt, k < n for t > T n+ \. Let 



n-l 



(36) Vn = V^I VH(y an+ t- Tn )dW t n , £n = 

3=0 



We want to compare fj n with 

(X„ +1 ) - tfpQ = / VH{X £ t ) dWl 1 + - / AH(Xf) dt 



(37) 

Equation (34) will follow if we show that 



Vn + Vn- 



(38) P x (jtifjj - % + f%)) > e 1 ' 2 - 5 ^ = 0(e R ), 

where 5 and R are the same as in (34). Fix < a\ < ai < < 5 such that 
5 < \ot\. Recall that we assumed that a < 1/8, while 5 < a can be taken 
arbitrarily small. Thus, without loss of generality, we may assume that 

\[ 3\alpha + \alpha_1 + 2\delta < \tfrac{3}{8}. \tag{39} \]

Call n "good" if d{Y n ,B) > V^TlV /2 ~ ai and "bad" otherwise. This dis- 
tinction between good and bad n, is motivated by the fact that for good n, 
the distance d(X n ,Y n ) is much smaller than d(Y n ,B) with large probability, 
which will be very convenient in the arguments below. Further, formulas 
(28), (29) and (31) hold if we select x = Y n as the initial point and n is 
good. 

By Lemma 4.5 and part (b) of Lemma 5.2, the number of bad n such 
that n<N(x) is 0(e _Q2 ). By (30), the total contribution to the sum in 
(38) from fjj + ryj with bad j is bounded by 0{e 1 / 2 ~ a ' i ) [except on a set of 
probability 0{e R )\. The contribution to the sum from fjj with bad j is a 
Gaussian random variable with variance bounded from above by 0(e 1 ~ a2 ), 
due to (33). Therefore, 



max . 

0<n<n\ 



n 

(Vj-(Vj+Vj)) 

j = 0,j— bad 



l/2-<5\ 

> £ -^}=0(e R ). 
Since R was arbitrary, this implies that
(40) 



max 

0<n<ni 



E to 

j =0, j — bad 



,1/2-5' 



> 



0(e R ). 



The contribution to the sum from fj" with good j is bounded from above 
in absolute value by 



711 — 1 



0(e) £ (j 3+1 -ri, 

3=0,3- 



Therefore, due to Lemma 4.5 and formula (29), we have 



max r x 

0<n<m 



n-l 

E 

3=0, j -good 



>e 



2/3-ai 



0(e R ). 



Since R was arbitrary, this implies that 

n-l 



(41) 



" x max 

\ 0<n.<ni 



E # 

i=0,J-good 



> e 2 /3-«i =o(e R ). 



Therefore, formula (38) will follow if we show that 



Til— 1 



(42) 



E (V3-%)>e 1/2 - S 

\?'=0j'-good J 



0(e R ). 



Let denote the deterministic orbit starting from X n at time a n . Let 
s n = a(X n ). Define 



V~e VH(y2 n+t - T JdW t n . 



We want to estimate fjj — rjj by comparing both of them to fjj . Let £ be the 
first time when d(X n ,Y n ) > \^/ne 1 / 2 ~ OL1 . Define 

n n 

M n = E (Vj -flj)X{j<e}, <= E (Vj ~ Vj)X{j<ey 

j=0,j-good j=0j-good 

The quadratic variation of the martingale M n is equal to 

rTj+min(sj,Sj) 



V n -e E X{j<£} 

J=0,j-good 



\VH(y aj+t _ Tj )-VH(yi j+t _ T] )\ 2 dt 



+ 



Tj+man(sj ,Sj) 
Tj+min(sj ,Sj) 



\VH(-)\ 2 dt 
(the argument in the second integral is either y or y, depending on whether
Sj or Sj is larger). The quadratic variation of the martingale M' n is 

n 

j=0,jr'-good 



\VH(X!)-VH(yi j+t _ Tj )\ 2 dt 



rxaax(rj +1 ,Tj+Sj) 
+ / ^ \VH{-)\ 2 dt 



min (tj + 1 ,Tj +Sj 



(the argument in the second integral depends on whether Tj+i or Tj + Sj is 
larger), where Tj is the a-algebra of events determined by Wt by the time 
Tj. By (32), we have the estimate 

n i/2-ai 

(43) V ni -i < e d(Y- B) 

where the second inequality is due to part (b) of Lemma 5.2. We can estimate 
by using (31) instead of (32). Namely, 

(44) V' i < e V — < e 1 "" 2 

J=0j-good v J' I 

If i? > is fixed, then applying the L p -maximal inequality, we get 

if p is large enough. A similar argument shows that 
(46) Pf max \M' n _ x \ > £ l ' 2 ~ a A = 0{e R ). 

\0<n<ni / 

Therefore, we obtain 

wj ? E (%-^)>£ 1/2_<5 ) <0(e R )+¥ x (£< ni ), 

\j=0,j-good / 

so it remains to show that the last term is 0{e R ). To this end, we shall show 
that the main contribution to the distance between X n and Y n comes from 
the Gaussian term £ n . 

Define n 3 = 64e~ 2( - 5 - ai \ We have 



: (£ < ni ) = F x (£ < n 3 ) + P x (n 3 < * < m). 
The first term can be estimated as follows:

£ l/2-ai 



x(£ < n 3 ) < P x max \H{X n+l ) - H{X n )\ > 



X 



0<n<n 3 2fi3 

l/2-(3ai-2<5)- 

max \H{X n+l )-H{X n )\> — = 0(e R " 

0<n<n 3 12o 



by (30). On the other hand by (36), (37), (40) and (41), we have 

(47) d(X n ,Y n ) X{i >n} < \L - Mn-i - M' n _ x \ + Z n , 
where 

(48) Pj max \Z n \ > e 1/2 ~ s ) =0(e R ). 

\0<n<ni J 



It follows from (47) that 



P x {nz <£<n l ) <P( max 

\n3<n<ni 



\K-i\ ^ i i 



_ p 1/2-«i 



> -e 



n 



+P f max l%ii>L 1/2-^ 

Vn3<n<ni a/71 8 



^ 1 1/2— an 

max —= > —e 1 1 

n 3 <n<ni a/71 8 



+ r x max — =■ > -e 



l^n[ ^ 1 1/2-a 



n,3<n<m yjn 



From the definition of 72.3, it follows that for n> 113, we have -^/ne 1 / 2- " 1 /8 > 
e 1 / 2- * 5 and -y/ne 1 / 2- " 1 /8 > e 1 / 2- " 2 . Since is Gaussian and its variance is 
bounded by a constant factor of en, due to (33), we have 



for some positive constants k\ and &2, which implies that 

(49) P max ||n| > ^ )=0(e fl )- 

Hence (45), (46), (48) and (49) give 

F x {n 3 <t<n 1 ) = 0(e R ), 

implying (42), which completes the proof of Lemma 5.4(a). 

Let us now discuss the modifications needed to prove part (b) of Lemma 5.4. Assuming that $x \in \Gamma$ and $n_2 \le 2\varepsilon^{-\alpha}$ are such that $\|\theta(x) + \rho n_2\| \le \sqrt{n_2\varepsilon}$, we prove that $d(Y_n, B) \ge \varepsilon^{1/8}$ for all $n < n_2$ if $\varepsilon$ is sufficiently small. Indeed, we would otherwise get

\[ |(n_2 - n)\rho - p| \le \sqrt{n_2\varepsilon} + \varepsilon^{1/8} \le 2\varepsilon^{1/8} \]

for some integer $p$ since $\alpha < 1/8$. This contradicts Lemma 5.1 since $n_2 - n \le 2\varepsilon^{-\alpha}$.

We can now repeat the construction used in the proof of part (a) of Lemma 5.4. We claim that there exist $\delta > 0$ and a random variable $\tilde\xi_{n_2}$ with the same distribution as $\xi_{n_2}$ such that for any $R > 0$, we have

\[ \mathbb{P}_x\bigl(\|\theta(X_{n_2}) - \theta(Y_{n_2}) - \tilde\xi_{n_2}\| > \varepsilon^{1/2+\delta}\bigr) = O(\varepsilon^R). \tag{50} \]

This inequality implies part (b) of Lemma 5.4 in the same way that (34) implies part (a) of the same lemma.

In order to prove (50), we do not need to separate $j \le n_2$ into good and bad [instead, we treat all $j$ as we treated the good ones in the proof of part (a)]. The distance from $Y_n$ to $B$ is now controlled using the inequality $d(Y_n, B) \ge \varepsilon^{1/8}$ for all $n < n_2$. The sums in (43) and (44) can now be estimated by

\[ O\bigl(\varepsilon\, n_2^{3/2}\, \varepsilon^{-1/8}\, \varepsilon^{1/2-\alpha_1}\bigr) = O\bigl(\varepsilon^{1+[3/8-(3\alpha+\alpha_1)]}\bigr) = O(\varepsilon^{1+2\delta}), \]

due to (39). This inequality gives the improvement of $\varepsilon^{1/2+\delta}$ in (50) compared to $\varepsilon^{1/2-\delta}$ in (34). The rest of the technical details remain the same as in the proof of part (a). □

APPENDIX 

A.l. Proofs of technical lemmas. In this section, we prove Lemmas 4.2, 
4.3 and 5.3. 

We shall need the following simple lemma (see, e.g., [10], where it was proven in the case $n = 1$, the proof for $n > 1$ being similar).

Lemma A.1. Let $X_t^1$ and $X_t^2$ be the following two diffusion processes on $\mathbb{R}^d$ with infinitely smooth coefficients:

\[
\begin{aligned}
dX_t^1 &= v(X_t^1)\,dt + a(X_t^1)\,dW_t + \sqrt{\varepsilon}\,v_1(X_t^1)\,dt + \sqrt{\varepsilon}\,a_1(X_t^1)\,dW_t, \\
dX_t^2 &= v(X_t^2)\,dt + a(X_t^2)\,dW_t + \sqrt{\varepsilon}\,v_2(X_t^2)\,dt + \sqrt{\varepsilon}\,a_2(X_t^2)\,dW_t,
\end{aligned}
\]

with $X_0^1 = X_0^2$. Suppose that for a certain constant $L$, the following bound on the coefficients holds:

\[ |\nabla v^i|,\ |\nabla a^{ij}|,\ |v_1^i|,\ |v_2^i|,\ |a_1^{ij}|,\ |a_2^{ij}| \le L, \qquad i,j = 1,\ldots,d, \]

where $i$ and $j$ stand for the vector (matrix) entries of the coefficients. Let $\mu$ be the initial distribution for the processes $X_t^1$ and $X_t^2$. Then, for any positive integer $n$, some constant $K = K(L,n)$ and any $t, \eta > 0$, we have

\[ \mathbb{P}_\mu\Bigl(\sup_{0 \le s \le t} |X_s^1 - X_s^2| > \eta\Bigr) \le \frac{(e^{Kt} - 1)\,\varepsilon^n}{\eta^{2n}}. \]
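
The scaling in this kind of comparison estimate can be illustrated with a toy simulation. The sketch below is an assumption-laden example, not the setting of the paper: it takes $d = 1$, $v(x) = -x$, $a = 1$, $v_1 = a_1 = 1$, $v_2 = a_2 = 0$, drives both processes with the same Brownian increments, and uses an Euler-Maruyama discretization. With these choices the difference of the two paths is exactly linear in $\sqrt{\varepsilon}$, so the supremum distance divided by $\sqrt{\varepsilon}$ does not depend on $\varepsilon$.

```python
import math
import random


def sup_distance(eps, t_max=1.0, steps=2000, seed=0):
    """Sup over the time grid of |X^1 - X^2| for two coupled Euler-Maruyama paths.

    Toy coefficients (assumptions for illustration): v(x) = -x, a = 1,
    v_1 = a_1 = 1 for the first process, v_2 = a_2 = 0 for the second.
    """
    rng = random.Random(seed)
    dt = t_max / steps
    root_eps = math.sqrt(eps)
    x1 = x2 = 0.5
    sup = 0.0
    for _ in range(steps):
        dw = rng.gauss(0.0, math.sqrt(dt))  # shared Brownian increment
        x1 += -x1 * dt + dw + root_eps * dt + root_eps * dw
        x2 += -x2 * dt + dw
        sup = max(sup, abs(x1 - x2))
    return sup


if __name__ == "__main__":
    for eps in (1e-2, 1e-3, 1e-4):
        s = sup_distance(eps)
        print(eps, s, s / math.sqrt(eps))
```

This is consistent with the bound in Lemma A.1 depending on $\varepsilon$ and $\eta$ only through the ratio $\varepsilon/\eta^2$.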
Proof of Lemma 4.2. Let us consider the flow on the plane (so as to make $H$ single-valued) and let $A$ be one of the saddle points (there are countably many saddle points on the plane which correspond to the saddle point on the torus). Without loss of generality, we may assume that $H(A) = 0$. Let $B \in I_\varepsilon$ be the point which is carried to $A$ by the deterministic flow (see the picture below). There are two branches of the separatrix that leave the point $A$. One goes inside the ergodic component, while the other forms the boundary between the periodic and the ergodic components. Let us take a point $R$ on the latter, which is distance $d$ away from point $A$ (we measure the distance along the separatrix in the direction of the flow from $A$ to $R$, and $d$ will be specified below). Let us also take a point $Q$ on the separatrix which is distance $d$ away from point $B$ (here, we measure the distance in the direction of the flow from $Q$ to $B$).

Let $\gamma$ be the curve which consists of two parts of the separatrix: the part between $Q$ and $A$, and the part between $A$ and $R$. In a neighborhood of any point $x \in \gamma$, $x \ne A$, we can consider the smooth change of coordinates $(x_1, x_2) \to (\varphi, \theta)$, where $\varphi = H/\sqrt{\varepsilon}$ and $\theta$ is defined by the following conditions: $|\nabla\theta| = |\nabla H|$ on $\gamma$, $\nabla\theta \perp \nabla H$ and $\theta$ increases in the direction of the deterministic flow. In fact, we can extend this change of coordinates to the region defined by

\[ D = \{(\varphi,\theta) : |\varphi| \le 2,\ q \le \theta \le r\} \setminus \{(\varphi,\theta) : 0 \le \varphi \le 2,\ \theta = a\}, \]

where $\theta(Q) = q < \theta(B) = b < \theta(A) = a < \theta(R) = r$. We needed to make the cut in the region $D$ since the change of coordinates degenerates at $A$ (see Figure 4). Let $\tau_D$ be the first time when a trajectory of the process $\widetilde{X}_t^\varepsilon$ reaches the boundary of $D$ [or, rather, the boundary of the preimage of $D$ in $(x_1, x_2)$ coordinates, which will also be denoted by $D$]. If $d$ is small enough, then this set does not intersect $\Gamma'$ and therefore $\tau_D < \tau_1$.

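In $(\varphi, \theta)$ coordinates, the generator of the process $\widetilde{X}_t^\varepsilon$ takes the form

\[ M^\varepsilon f = \tfrac{1}{2}\bigl(f''_{\varphi\varphi}|\nabla H|^2 + \varepsilon f''_{\theta\theta}|\nabla\theta|^2 + \sqrt{\varepsilon}\,f'_\varphi\,\Delta H + \varepsilon f'_\theta\,\Delta\theta\bigr) + f'_\theta\,|\nabla H|\,|\nabla\theta|. \tag{51} \]

The set $I_\varepsilon$ in $(\varphi, \theta)$ coordinates takes the form $\{(\varphi,\theta) : |\varphi| \le 1,\ \theta = b\}$. The part of the curve $\gamma(\varepsilon^{1/2})$ that belongs to $D$ takes the form $\{(\varphi,\theta) : \varphi = 1,\ a \le \theta \le r\}$. It is sufficient to show that if the process starts at a point $x$ in the former set, then it reaches the latter set before it reaches the boundary of $D$ with probability at least $c > 0$.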

Let us take four points on the p axis: > pi > p>i > <ps > p>\ > — 2. Let 
5 > be a small constant which will be specified later. Define three segments 
inside the domain D as follows: I\ = {(ip, 9) : p>i < p> < 973, 9 = a — 5}, I2 = 
{(p, 9) : ip± < p < <p4, 9 = a + 5} and I3 = {(p, 9):p = l,a<9<r}. Let ti x 
be the first time when the process reaches I\\ similarly, t/ 2 and tj 3 are 
the first times when the process reaches I2 and ^3, respectively. By the 

Fig. 4. Change of variables.


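Markov property of the process, it is sufficient to show that there are positive constants $\delta$, $c_1$, $c_2$ and $c_3$ such that

\[ \inf_{x \in I_\varepsilon} \mathbb{P}_x(\tau_{I_1} < \tau_D) \ge c_1, \qquad \inf_{x \in I_1} \mathbb{P}_x(\tau_{I_2} < \tau_D) \ge c_2 \qquad\text{and}\qquad \inf_{x \in I_2} \mathbb{P}_x(\tau_{I_3} < \tau_D) \ge c_3. \tag{52} \]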

For the first inequality, we define the domain $D_1 = \{(\varphi,\theta) : |\varphi| \le 2,\ q \le \theta \le a - \delta/2\} \subset D$ (assuming that $\delta$ is fixed). Clearly, $I_\varepsilon, I_1 \subset D_1$. In this domain, we consider the processes $(\Phi_t^\varepsilon, \Theta_t^\varepsilon)$ and $(\Phi_t, \Theta_t)$, which are defined by

r- |V0| mrB ( |V0| \ , 

and 

d6 t = dt. 

The generator of the first process, after multiplying all the coefficients by 
|Vff| 2 , becomes the operator M e from (51). Therefore, the transition proba- 
bilities for this process are the same as for the original process Xf . We would 
like to apply Lemma A.l to the pair of processes @t) and (<J? t ,0 t ). 

Let us follow the process ©j) starting at x G I £ for the time to = a — 
5/2 — b. It can easily be seen that the probability that this process reaches 
I\ before time to an d before leaving D± is bounded from below. Moreover, 
the same is true for all the small perturbations of the trajectories of (3> t , ©*)■ 
More precisely, there are positive constants c\ and C2 such that for any x G J e , 
there is an event £1', whose probability is at least c±, with the property that 
if u G fi', then for any function (ip(t),9(t)) : [0,to] — ► Di which satisfies 

SUp ||(^(t),fl(t))-(¥ t ,© t )H||<S2, 
*6[0,to] 

we have (p(t),0(t)) G Z?i for all f G [0,to] 5 an d (<£>(to), 0(tq)) G 7i for some 
r G [0,to]. 

Note that |V0|/| Vi/| — ► 1 uniformly in Z?i as e — > since we are consider- 
ing d to be fixed for now and the domain D\ is a positive distance away from 
the saddle point. Therefore, we can consider ($t, 0^) as a small perturbation 
of the process t ) in Z?i. Let fi" be the subset of O' consisting of all cj 
for which 

sup \\($ t ,@ t )(u)-($ t ,Q t ){u)\\ >c 2 . 
te[o,to] 

By Lemma A.l, we can make sure that the probability of Q," is less than 
ci/2 for all sufficiently small e. We thus obtain the first inequality in (52) 
with ci = ci/2. 

The third inequality in (52) can be proven in exactly the same way if we 
consider the domain Di = {(if, 6) : \ip\ < 2, a + | < 8 < r} instead of D\. 

Finally, we claim that there is a sufficiently small $\delta$ such that the second inequality holds. Let us consider the intervals $I_1$ and $I_2$ in the original coordinates. By the Morse lemma, there is a smooth change of variables in a neighborhood $O$ of the saddle point $A$ such that in the new variables, the stream function is $H(x_1, x_2) = x_1x_2$, and the part of $D$ where $\varphi$ was negative now lies in the first quadrant $x_1, x_2 > 0$. In the new variables, the generator of the process $\widetilde{X}_t^\varepsilon$, after a random change of time, becomes $L^\varepsilon f = \varepsilon L_1 f + v_1 \cdot \nabla f$, where

\[
L_1 f = a_{11}(x_1,x_2)\frac{\partial^2 f}{\partial x_1^2} + a_{12}(x_1,x_2)\frac{\partial^2 f}{\partial x_1\,\partial x_2} + a_{22}(x_1,x_2)\frac{\partial^2 f}{\partial x_2^2} + b_1(x_1,x_2)\frac{\partial f}{\partial x_1} + b_2(x_1,x_2)\frac{\partial f}{\partial x_2}
\]

is a differential operator with first- and second-order terms with bounded coefficients and where $v_1(x_1, x_2) = (-x_1, x_2)$. We shall consider the operator $L^\varepsilon$ in the domain $\mathcal{D}^\varepsilon = O \cap \{x_1 > 0;\ x_2 > 0;\ x_1 + x_2 > \varepsilon^{1/3}\}$. We make a further change of variables in $\mathcal{D}^\varepsilon$:



\[ (x_1, x_2) \to (u, v) = \Bigl(\frac{x_1x_2}{\sqrt{\varepsilon}},\ x_2 - x_1\Bigr). \]

In the new variables, after dividing all the coefficients of the operator by $(x_1 + x_2)$, which amounts to a random change of time for the process, the operator can be written as

\[ L^\varepsilon f = N^\varepsilon f + \frac{\partial f}{\partial v}, \tag{53} \]

where $N^\varepsilon$ is a differential operator with first- and second-order terms. We claim that all the coefficients of $N^\varepsilon$ are of the form

\[ \frac{c_1(x_1,x_2)\,p_1(x_1,x_2)\sqrt{\varepsilon} + c_2(x_1,x_2)\,p_2(x_1,x_2) + c_3(x_1,x_2)\,\varepsilon + c_4(x_1,x_2)\sqrt{\varepsilon}}{x_1 + x_2}, \]

where $c_1, \ldots, c_4$ are bounded functions and $p_1$ and $p_2$ are homogeneous first and second degree polynomials, respectively. Indeed, using the expression for $(u,v)$ in terms of $(x_1, x_2)$, we can write

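\[
\varepsilon\,a_{12}(x_1,x_2)\frac{\partial^2 f}{\partial x_1\,\partial x_2}
= a_{12}(x_1,x_2)\left( x_1x_2\frac{\partial^2 f}{\partial u^2} + (x_2 - x_1)\sqrt{\varepsilon}\,\frac{\partial^2 f}{\partial u\,\partial v} + \sqrt{\varepsilon}\,\frac{\partial f}{\partial u} - \varepsilon\frac{\partial^2 f}{\partial v^2} \right).
\]

The other terms of the operator $L_1$ can be treated similarly.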
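
The change of variables can be verified by a short chain-rule computation. The following worked derivation is a sketch that assumes only the formulas $u = x_1x_2/\sqrt{\varepsilon}$, $v = x_2 - x_1$ and $v_1 = (-x_1, x_2)$ given in the text; it shows why the drift reduces to $\partial/\partial v$ after division by $x_1 + x_2$, and where the terms in the mixed second derivative come from.

```latex
% First and second derivatives of the new variables
u_{x_1} = \frac{x_2}{\sqrt{\varepsilon}}, \quad
u_{x_2} = \frac{x_1}{\sqrt{\varepsilon}}, \quad
u_{x_1 x_2} = \frac{1}{\sqrt{\varepsilon}}, \qquad
v_{x_1} = -1, \quad v_{x_2} = 1, \quad v_{x_1 x_2} = 0.

% Drift term: v_1 \cdot \nabla acts only in the v-direction
v_1 \cdot \nabla u = \frac{-x_1 x_2 + x_2 x_1}{\sqrt{\varepsilon}} = 0, \qquad
v_1 \cdot \nabla v = (-x_1)(-1) + x_2 \cdot 1 = x_1 + x_2,
\quad\text{so}\quad
\frac{v_1 \cdot \nabla f}{x_1 + x_2} = \frac{\partial f}{\partial v}.

% Mixed second derivative by the chain rule
\frac{\partial^2 f}{\partial x_1 \partial x_2}
= f_{uu}\, u_{x_1} u_{x_2}
+ f_{uv}\,(u_{x_1} v_{x_2} + u_{x_2} v_{x_1})
+ f_{vv}\, v_{x_1} v_{x_2}
+ f_u\, u_{x_1 x_2}
= \frac{x_1 x_2}{\varepsilon} f_{uu}
+ \frac{x_2 - x_1}{\sqrt{\varepsilon}} f_{uv}
- f_{vv}
+ \frac{1}{\sqrt{\varepsilon}} f_u .
```

Multiplying the last identity by $\varepsilon$ produces coefficients $x_1x_2$, $(x_2 - x_1)\sqrt{\varepsilon}$, $\varepsilon$ and $\sqrt{\varepsilon}$, which after division by $x_1 + x_2$ are of the form claimed for the coefficients of $N^\varepsilon$.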

Therefore, all the coefficients of $N^\varepsilon$ can be made arbitrarily small in $\mathcal{D}^\varepsilon$ by first selecting a sufficiently small neighborhood $O$ of the point $A$ and then taking $\varepsilon$ to be sufficiently small.

We can now compare the process whose generator is the operator (53) with the deterministic process with generator $\partial/\partial v$. If we take $\delta$ to be sufficiently small, then both $I_1$ and $I_2$ are inside $\mathcal{D}^\varepsilon$. Further, the transition time from $I_1$ to $I_2$ for the deterministic process with generator $\partial/\partial v$ is uniformly bounded in the initial point. The second inequality in (52) can now be deduced from Lemma A.1 in the same way as was the first inequality. □

Proof of Lemma 4.3. The proof of Lemma 4.3 is similar to the arguments which can be found in [7], where the behavior of the process $\widetilde{X}_t^\varepsilon$ in a neighborhood of a saddle point was studied in detail, so we shall only indicate the main steps. As before, we assume that $A$ is one of the saddle points of $H$ on the plane and that $H(A) = 0$. Let $O$ be a small neighborhood of $A$. There are two branches of the separatrix which enter the point $A$ and two which leave it. If $O$ is small enough, then, in a neighborhood of each of the branches of the separatrix intersected with $O$, one can make the change of coordinates $(x_1, x_2) \to (H, \theta)$, where $\theta$ is defined as before, separately in the neighborhood of each of the branches.

Let us take four points, $(P_i, Q_i, R_i, S_i)$, $1 \le i \le 4$, on each of the four branches, defined by the conditions $|\theta(P_i) - \theta(A)| = \delta$, $|\theta(Q_i) - \theta(A)| = 2\delta$, $|\theta(R_i) - \theta(A)| = 3\delta$ and $|\theta(S_i) - \theta(A)| = 4\delta$. We can number the branches in such a way that the first and the third branch are the stable directions for the deterministic flow, while the second and the fourth are the unstable directions. Moreover, we can assume that $P_2$ is carried into $P_3$ (and then to $A$) by the flow (see Figure 5).

By the Markov property, it suffices to show that there are constants $c, r > 0$ such that for all $x \in \mathrm{Cl}(\mathcal{E} \cup U_{\varepsilon^{1/2}})$ and all sufficiently small $\varepsilon$, we have

\[ \mathbb{P}_x(\bar\tau_1 < c|\ln\varepsilon|) \ge r. \tag{54} \]

The flow $\widetilde{X}_t^\varepsilon$ can be viewed as a small perturbation of the deterministic flow defined by $y_t' = v(y_t)$, which carries $\Gamma'$ into $\Gamma$ in finite time. This, together with Lemma A.1, implies that

\[ \inf_{x \in \Gamma'} \mathbb{P}_x(\bar\tau_1 < c_1) \ge r_1 \tag{55} \]

for some positive $c_1$ and $r_1$. Let us define four nested neighborhoods of the point $A$ as follows. The neighborhood $\mathcal{P}$ is bounded by eight smooth curves, which are the level sets $|H| = \delta$ and $\theta = \theta(P_i)$, $1 \le i \le 4$. Similarly, $\mathcal{Q}$, $\mathcal{R}$ and $\mathcal{S}$ are bounded by $|H| = 2\delta$ and $\theta = \theta(Q_i)$, $1 \le i \le 4$, $|H| = 3\delta$ and $\theta = \theta(R_i)$, $1 \le i \le 4$, and $|H| = 4\delta$ and $\theta = \theta(S_i)$, $1 \le i \le 4$, respectively.

Let $\tau_{\mathcal{P}}$ be the first time when the process $X_t^\varepsilon$ reaches $\Gamma'$ or $\mathcal{P}$, whichever happens first. It is clear that the time it takes for the unperturbed process which starts in $\mathrm{Cl}(\mathcal{E} \cup U^{\varepsilon^{1/2}})$ to reach $\Gamma'$ or $\mathcal{P}$ is uniformly bounded in the initial point. This, together with Lemma A.1, easily implies that 

(56) $\inf_{x \in \mathrm{Cl}(\mathcal{E} \cup U^{\varepsilon^{1/2}})} P_x(\tau_{\mathcal{P}} < c_2) > r_2$ 

for some positive $c_2$ and $r_2$ and all sufficiently small $\varepsilon$. Due to (55) and (56), we see that it is sufficient to establish (54) for $x \in \mathrm{Cl}(\mathcal{E} \cup U^{\varepsilon^{1/2}}) \cap \mathcal{P}$. 

Fig. 5. Exit from a neighborhood of a saddle point. 

Let $\mathcal{Q}_1$, $\mathcal{Q}_2$, $\mathcal{Q}_3$ and $\mathcal{Q}_4$ be the parts of the boundary of $\mathcal{Q}$ which are given by $\theta = \theta(Q_1)$, $\theta = \theta(Q_2)$, $\theta = \theta(Q_3)$ and $\theta = \theta(Q_4)$, respectively. We shall use similar notation for the parts of the boundary of $\mathcal{R}$ and $\mathcal{S}$. It follows from the arguments in Section 4 of [7] that if the process $X_t^\varepsilon$ starts at a point $x \in \mathcal{P}$, then it leaves $\mathcal{Q}$ either through $\mathcal{Q}_2$ or through $\mathcal{Q}_4$ with probability which can be made arbitrarily close to one, uniformly in $x \in \mathcal{P}$, by considering sufficiently small $\varepsilon$. Further, the expectation of the time it takes for the process $X_t^\varepsilon$ to exit $\mathcal{Q}$ is bounded from above by a constant multiple of $|\ln \varepsilon|$, as follows from Lemma 4.6 of [7]. Therefore, it is sufficient to establish (54) for $x \in \mathrm{Cl}(\mathcal{E} \cup U^{\varepsilon^{1/2}}) \cap (\mathcal{Q}_2 \cup \mathcal{Q}_4)$. 

Any trajectory of the unperturbed flow which starts at $x \in \mathcal{Q}_4$ reaches $\Gamma'$ in finite time and so, with the help of Lemma A.1, we get that 

$\inf_{x \in \mathcal{Q}_4} P_x(\tau < c_3) > r_3$ 

for some positive $c_3$ and $r_3$ and all sufficiently small $\varepsilon$. As for initial points in $\mathcal{Q}_2$, the trajectories of the unperturbed flow which start at $x \in \mathcal{Q}_2$ reach $\mathcal{R}_3$ in finite time. Using Lemma A.1 again, we see that it is now sufficient to establish (54) for $x \in \mathrm{Cl}(\mathcal{E} \cup U^{\varepsilon^{1/2}}) \cap \mathcal{R}_3$. 
For the process $X_t^\varepsilon$ which starts at $x \in \mathrm{Cl}(\mathcal{E} \cup U^{\varepsilon^{1/2}}) \cap \mathcal{R}_3$, there is a uniformly positive probability that it exits $\mathcal{S}$ through $\mathcal{S}_4$, as can be seen with the help of the arguments used in the proof of Lemma 4.2. Further, the expectation of the time it takes for the process $X_t^\varepsilon$ to exit $\mathcal{S}$ is bounded from above by a constant multiple of $|\ln \varepsilon|$, as follows from Lemma 4.6 of [7]. Thus, (54) now needs to be established for $x \in \mathcal{S}_4$. 

The time it takes for the deterministic process which starts at $x \in \mathcal{S}_4$ to reach $\Gamma'$ is uniformly bounded in $\varepsilon$ and in the initial point. Therefore, we get the desired result by again applying Lemma A.1. □ 

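Proof of Lemma 5.3. We first establish two inequalities which are slightly different from (28) and (30), namely 

(57) $P_x\Big(\sup_{s \le \sigma+1} |X_s^\varepsilon - y_s| > \frac{\varepsilon^{1/2-\delta''}}{d(x,B)}\Big) \le \varepsilon^R$, if $d(x,B) \ge \varepsilon^{1/2-\delta'}$, 

(58) $P_x\Big(\sup_{s \le \sigma+1} |H(X_s^\varepsilon) - H(x)| > \varepsilon^{1/2-\delta''}\Big) \le \varepsilon^R$, if $d(x,B) \ge \varepsilon^{1/2-\delta'}$. 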

The proof is based on the use of Lemma A.1. We cannot, however, apply Lemma A.1 to the pair of processes $X_t^\varepsilon$ and $y_t$ directly, since $\sigma$ grows logarithmically in $\varepsilon$ when $d(x,B) = \varepsilon^{1/2-\delta'}$. 

It is clear that $\sigma + 1 \le c_1|\ln \varepsilon|$ for some constant $c_1$ which depends on $\delta'$, if $d(x,B) \ge \varepsilon^{1/2-\delta'}$. We claim that the deterministic flow $y_t$ has the following property (this is a minor modification of a similar statement from the proof of Lemma 4.6 of [10], so we do not prove it here). 

Let $0 = t_0 < t_1 < t_2 < \cdots < t_m = \sigma + 1$ and $0 < \varkappa < \min(\delta', \delta'')$. Consider a process $\tilde y_t$ which solves the equation 

(59) $d\tilde y_t = v(\tilde y_t)\,dt$ 

on each of the segments $[t_0, t_1), [t_1, t_2), \ldots, [t_{m-1}, t_m]$, with a finite number of jump discontinuities $\lim_{t \to t_i+} \tilde y_t - \lim_{t \to t_i-} \tilde y_t = p_i$, $i = 1, \ldots, m-1$. Then, under the conditions 

$\tilde y_{t_0} = y_{t_0}$, $\quad d(y_{t_0}, B) \ge \varepsilon^{1/2-\delta'}$, $\quad \sum_{i=1}^{m-1} \|p_i\| \le \varepsilon^{1/2-\varkappa}$, 

we have 

(60) $\sup_{0 \le t \le t_m} \|\tilde y_t - y_t\| \le c_2 \frac{\sum_{i=1}^{m-1} \|p_i\|}{d(y_{t_0}, B)}$ 

for some constant $c_2$ which does not depend on $y_{t_0}$. 

Let $K$ be the constant from Lemma A.1 applied to the pair of processes $X_t^\varepsilon$ and $y_t$, with some $n$ taken to be sufficiently large so that $\varkappa(2n-1) > R$. Select the points $0 = t_0 < t_1 < t_2 < \cdots < t_m = \sigma + 1$ in such a way that $\frac{\varkappa}{2K}|\ln \varepsilon| \le t_{i+1} - t_i \le \frac{\varkappa}{K}|\ln \varepsilon|$ for $i = 0, \ldots, m-1$. Since $\sigma + 1 \le c_1|\ln \varepsilon|$, we have the estimate $m \le 2c_1K/\varkappa$. 

Let $\tilde y_t^\varepsilon$ be the piecewise continuous process defined by the conditions $\tilde y_{t_i}^\varepsilon = X_{t_i}^\varepsilon$ and $d\tilde y_t^\varepsilon = v(\tilde y_t^\varepsilon)\,dt$ on $[t_i, t_{i+1})$, $i = 0, \ldots, m-1$. By Lemma A.1, 

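(61) $P_x\Big(\sum_{i=0}^{m-1} \sup_{t \in [t_i, t_{i+1})} \|X_t^\varepsilon - \tilde y_t^\varepsilon\| > \varepsilon^{1/2-\varkappa}\Big) \le \Big(\frac{2c_1K}{\varkappa}\Big)^{1+2n} \varepsilon^{\varkappa(2n-1)}$. 

Due to the continuity of $X_t^\varepsilon$, formula (61) provides an estimate for the sum of the jumps of the process $\tilde y_t^\varepsilon$. From (60), it now follows that 

(62) $P_x\Big(\sup_{0 \le t \le \sigma+1} \|y_t - \tilde y_t^\varepsilon\| > c_2 \frac{\varepsilon^{1/2-\varkappa}}{d(x,B)}\Big) \le \Big(\frac{2c_1K}{\varkappa}\Big)^{1+2n} \varepsilon^{\varkappa(2n-1)}$. 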

This, together with (61), implies (57) for all sufficiently small $\varepsilon$ since $\varkappa(2n-1) > R$. Since $H(x)$ is constant and $H(\tilde y_t^\varepsilon)$ is piecewise constant, we have 

$P_x\Big(\sup_{0 \le t \le \sigma+1} |H(X_t^\varepsilon) - H(x)| > \varepsilon^{1/2-\delta''}\Big) \le P_x\Big(\sum_{i=0}^{m-1} \sup_{t \in [t_i, t_{i+1})} \|X_t^\varepsilon - \tilde y_t^\varepsilon\| > \frac{\varepsilon^{1/2-\delta''}}{\sup \|\nabla H\|}\Big)$. 

This, together with (61) and the condition $\varkappa < \delta''$, implies (58) for all sufficiently small $\varepsilon$. 

We take $\delta'' < \delta'$ in (57), and note that if a trajectory of $X_t^\varepsilon$ stays within a neighborhood of the deterministic trajectory $y_s$ for time $\sigma + 1$, then $\tau_1 < \sigma + 1$. Therefore, (57) and the fact that $\sigma + 1 \le c_1|\ln \varepsilon|$ together imply (29) with $c = c_1$. 

From (57), it follows that $P_x(\tau_1 < \sigma + 1) \ge 1 - \varepsilon^R$, therefore (57) implies (28). 

To prove (30), we consider two cases: $d(x,B) \ge \varepsilon^{1/2-\delta''}$ and $d(x,B) < \varepsilon^{1/2-\delta''}$ [i.e., we set $\delta' = \delta''$ in (58)]. In the first case, the result follows from (58) and the fact that $P_x(\tau_1 < \sigma + 1) \ge 1 - \varepsilon^R$. To deal with the latter case, we define the set $D$ as the union of all the trajectories of the deterministic flow starting at $x \in \Gamma$ with $d(x,B) < \varepsilon^{1/2-\delta''}$ and followed for a positive or negative time until they hit $\Gamma'$. Thus, $\mathrm{Cl}(D)$ contains all the points of $\mathcal{E}$ which are carried too close to the saddle point by the evolution of the deterministic flow starting at $x$ before the time when the trajectory reaches $\Gamma'$. 






Observe that our proof of (58) [and consequently of (30)] works not only for initial points $x \in \Gamma$ with $d(x,B) \ge \varepsilon^{1/2-\delta''}$, but for all points $x \in \mathcal{E} \setminus \mathrm{Cl}(D)$. For $x \in \mathrm{Cl}(D)$, we define $\tau_D$ to be the first time when the process $X_t^\varepsilon$ reaches the boundary $\partial\mathcal{E} \cup \partial\,\mathrm{Cl}(D)$. Note that $|H(X_{\tau_D}^\varepsilon) - H(x)| \le \varepsilon^{1/2-\delta''}$ and that (30) is true for $x = X_{\tau_D}^\varepsilon$. This implies that (30) holds for all $x \in \mathcal{E}$ with $2\varepsilon^{1/2-\delta''}$, instead of $\varepsilon^{1/2-\delta''}$, in the left-hand side. Since $\delta''$ was arbitrary, this is equivalent to (30). 

Let us now prove (31). The estimate on the expectation of the first term 
of I, 



rmm(<r,Ti ) —1/2—6" f 

E x iVHiX^-VHiy^dsKCi—— if d(x, B) > e 1 / 2 " 5 , 

Jo d(x,B) 



is an immediate consequence of (28) (here, c\ is some constant). Let E be 
the event 

r e V2-*" 



d(x,B 

where the constant k will be specified below. Then, 

/ r max(a,n) \ 1/2-5" 

E X ((1- XE ) / \VH(.)\ 2 ds <c 2 — — ifdCaj.Bj^e 1 / 2 -*. 

V Jmin(<7,Ti) / d{X,B) 

From (57), it easily follows that for sufficiently large k, we have ¥ X (E) < 
e R if d(x,B) > e 1 / 2 " 5 (A; needs to be chosen depending on the minimum 
of |V-ff| in a neighborhood of T). Using the fact that the process X^, 
is uniformly (in e) exponentially mixing, it is not difficult to show that 
^x( T l > t/e) < ce~ l l c for some positive constant c and all x G T 2 . In partic- 
ular, Ea;(riX{ Tl >i/ e 2}) = o(e R ) for any positive R. Recall that o < c\\ lne| if 
d(x,B)>e 1 / 2 - 5 ' . Therefore, 

rmax(ff ' n) lT7rr/ , 2j .\, D (_, _ , 

/ min((T,ri) 



eJxe / |V^(-)| 2 ^ <Ex X^ + ^f + nX{ T1 >l/ei} 

<c 3 s R - 2 iid(x,B)>s 1 / 2 - 5 '. 



Combining the above estimates, we obtain that (31) holds with an extra constant factor in the right-hand side. Since $\delta''$ was arbitrary, this is equivalent to (31). 

Finally, (32) is a statement about deterministic flows. Using the Morse 
lemma, this statement can be reduced to elementary estimates for a linear 
system in a neighborhood of the saddle point. □ 
A.2. Mixing property for X^. and X^. . This section is devoted to the 
proof of Lemma 2.3. We need a standard fact concerning the transition 
density for a diffusion process which starts in the interior of a domain and 
is stopped when it reaches the boundary. The proof of this lemma is similar 
to the proof of Theorem 21, I in [12]. 

Lemma A.2. Let $L^\varepsilon(x)$ be a family of differential operators in a connected domain $D$ with a smooth boundary, which are uniformly elliptic in $x \in D$ and $\varepsilon$, and whose coefficients and their first and second derivatives are uniformly bounded in $x \in D$ and $\varepsilon$. Assume that $L^\varepsilon(x)$ are generators for a family of diffusion processes $X_t^\varepsilon$ in the same domain. Let $\sigma$ be the first time when the process $X_t^\varepsilon$ reaches the boundary of $D$. Let $\mu_x^\varepsilon$ be the measure on $\partial D$ defined by $\mu_x^\varepsilon(A) = P_x(X_\sigma^\varepsilon \in A)$, where $A$ is any Borel subset of $\partial D$. Let $U$ be a domain, whose closure is contained in the interior of $D$. 

Then, there is a constant $c > 0$, which does not depend on $\varepsilon$, such that $\mu_x^\varepsilon(A) \ge c\lambda(A)$ for all $x \in U$ and all Borel sets $A \subseteq \partial D$, where $\lambda$ is the Lebesgue measure on $\partial D$. 

Proof of Lemma 2.3. Let us prove the result for ^ (the result for 

then follows immediately). As follows from [3], page 197, it is sufficient 
to prove that there exist a constant c > and a curve 7^7 (which do not 
depend on e) such that P2(x,dy) > c\(dy) for all x £ 7, y £ 7, where A is 
the Lebesgue measure on 7. 

In the domain between 7 and 7, we may consider a smooth change of coor- 
dinates (27, X2) — ► (y, 0), where ip = H/y/e and 9 is defined by the following 
conditions: |V#| = |V-ff| on 7, V# _L VH and 9 increases in the direction 
of the deterministic flow. We may assume that 9 £ [0,/ 7 |Vi?| dl] (with the 
endpoints of the interval identified) and that the saddle point corresponds to 
9 = 0. Fix six points oi,...,06 on 7, which satisfy < 9{a\) < ■ ■ ■ < 9(oq) < 
f |V-ff| dl. We take 7 to be the interval between 03 and 04. Let J be the 
interval [in (ip, 9) coordinates] defined by 1/2 < </? < 1, 9 = a2- 

Any realization of the process, which starts at 167, must pass through 
the level set 7(e 1 / 2 ) before hitting 7. It is easy to show (see, e.g., the argu- 
ments in Lemma 3.1 of [10]) that a realization which starts at x £ ^(e 1 / 2 ) 
goes through J before reaching 7 with probability bounded from below by 
a positive constant c$ which does not depend on x or e. Therefore, it is 
sufficient to show that there is a positive constant c\ such that 

(63) ¥ x (X £ a £d9)> Cl \{d9) 

for all x £ J, 9 £ [03, 04] and all sufficiently small e. 

For #0 £ [«3>«4] we denote the rectangle ki^/e < ip < 6>o — foe < 

9 < #0 + by TZ(9q, e, k±, fo) or simply by 1Z (see Figure 6). Let t-ji be the 
Fig. 6. Definition of the rectangles. 



first time when the process enters this rectangle. We would like to show that 
there are positive k±,k2 and C2, which do not depend on 9q and £, such that 

(64) P*(t^<cx)>c 2 £ 

if x £ J. First, however, let us prove that this inequality implies the state- 
ment of the lemma. In (ip, 9) coordinates, the generator of the process Xf 
takes the form 

M e f = ^-(f^\VH\ 2 + efH e \V0\ 2 + v^Atf + eft AO) 

(65) 

+ -f e \VH\\ve\. 

Consider, also, a larger rectangle < tp < Akty^e, do — 2&2£ < < 9o + 2k2e, 
which will be denoted by 1Z(9o,e,ki,k2) or simply by 1Z. We can make a 
further change of variables in 1Z, namely ((/?, 9) — > (ft, 9), where <p> = <p/\fs 
and 9 = {9 — 9o)/e. In the new coordinates, the generator becomes 

M £ f = —( -f'UVH\ 2 + -/—I W| 2 + f~AH + KAO) + ^fUVH\\V9\. 
After dividing all the coefficients by $|\nabla H|^2/\varepsilon^2$, which amounts to a random 
change of time for the process, we obtain the new operator for the time- 
changed process, 



«'f = \ (fk + /sill + ^jffp + ^A«l Vff I 2 ) + / J- 



We can now apply Lemma A. 2 to the process with the generator M e in the 
domain 1Z with the initial point x ElZ [note that 1Z and 1Z are fixed sets in 
(<p, 8) coordinates and that we can smooth out the corners of the rectangle 1Z 
so that it becomes a domain with a smooth boundary]. We conclude that for 
any x G TZ, the measure fi%(A) = F X (X% £ A), where A C {(p = 0,-1 <8 < 
1} C &1Z, is bounded from below by a measure whose density with respect 
to d9 is equal to a positive constant (which we shall denote by C3). 

Since d9 = dO/e, the measure /i| (in 8 coordinates) is bounded from below 
on the interval [6q — e,6q + e] by the measure whose density with respect 
to d6 is equal to c^/e. Combining this with (64) and using the Markov 
property of the process, we obtain that (63) holds for all 8 G [8q — e, 8q + e] 
with ci = C2C3. Since #0 G [ fl 3> a 4] was arbitrary, we see that it remains to 
prove (64). 

We define the following process in (<p,6) coordinates: 



r |V0| mr6 ( IV0I eA9 \ , 

where Wf and W® are independent Brownian motions which can be consid- 
ered to be defined on different probability spaces (fF, J^, P v ) and (ft 9 , F e , P e ) 
We shall denote the product of these spaces by (0,JT, P). The generator of 
this process, after multiplying all the coefficients by \VH\ 2 /e, becomes the 
operator M £ from (65). Therefore, the transition probabilities for this pro- 
cess are the same as for the original process Xf. 

We could also consider the change of variables (x±, x?) — ► (</?; 0) in an open 
set which contains the interval {H = 0,a\ < 8 < o,q} and does not depend 
on e. Thus, the process (3>t, @j) which starts at x = ((^0,02) G J can be 
defined until it exits the set S = {— c/ y/e < tp < c/ s/e, a\ <8 < a^}, where c 
is sufficiently small. Let ts be the first time when the process exits the set S. 
Using Lemma A.l, it is easy to see that su]) xG jP x (ts < — 02) < e 2 for all 
sufficiently small e. Let (po be fixed and denote the event where T5 > 05 — 02 
by S s . 

Let D denote the domain {(<p, 8) : < <p < 3, a\ < 8 < a^} and let the first 
time when the process leaves this domain be denoted by td- Note that 
$|\nabla H| = |\nabla\theta|$ for $\varphi = 0$ and that $|\nabla H|$ and $|\nabla\theta|$ are smooth functions of $(H, \theta)$ in $D$, while $H = \sqrt{\varepsilon}\varphi$. Therefore, after rewriting the equations in integral form, we obtain 

(66) $ t = <po + W? + V£ [ gl($ s ,e s )ds, 

Jo 

(67) S t = a 2 + V^W t e + e [ g £ 2 {<5> s ,® s )dW e s + t + f ds, 

Jo Jo 

where gf, g 2 and g\ are bounded in C 1 (D) by a constant which does not 
depend on s. Note, also, that g\ is a bounded function in the domain S with 
a bound which does not depend on e. 

Formulas (66) and (67) show why it is reasonable to expect (64) to hold. 
They suggest that 

*t ~ + Wf, 

® t ^a 2 + t + ^/e ( g 3 {ip + Wf,a 2 + s)ds + y/eWf. 
Jo 

Let to = Go — a 2 . We want ipo + Wf to lie in an interval of size 0(y/e), which 
happens with probability 0(y/e). We also want W® to cancel J g3((fo + 
Wf,a 2 + s) ds up to an error of size no larger than 0{^/e). The first event, 
involves only Wf , while for the second event, we may assume that a real- 
ization of wf is fixed and that W® is independent of Wf . So, the required 
cancellation happens with probability 0{yfe). Multiplying those probabili- 
ties, we get (64). Let us now proceed with the precise argument. 

Consider the time interval i" = [t — ^/e, to + y/e\ . Let E^ C W be the event 
that there is a time n such that tj e I, ipo + Wfi = Ikwfe and tpo + Wf £ 
(2/ci-y/e, 2) for all < s < tj (k\ will be selected later). Let us note that on the 
intersections of the events Eg H [E^ x f] e ), we have the following estimates. 

(a) Since g\ is bounded on S, there is a constant c\ such that 

/■OS -02 

/ |^(* a ,0,)|da < cix/i- 
Jo 

Therefore, if we take k\ > c\ we obtain <J> T7 £ [fci^/e, 3/ci-v/e] and $ s € (fci-v/i, 3) 
for all < s < 77 . Thus, ($ a , 9 S ) € D for < s < rr. 

(b) Let 5 C Q be the subset of Es n x fi e ) where sup 0<i<T/ e / * g 2 ($ s , 
@t) dW® > \f~£. Since is bounded on D, it is easily seen (e.g., using moment 
inequalities for martingales) that ¥(£) < e 2 for all sufficiently small e. Let £ 
be the complement of £ . 

(c) Since g§ is bounded on D, there is a constant C2 such that 

m 

/ |53($ s ,9 s )|ds < c 2 Ve. 
Jo 
Let $\mathcal{E}^\theta \subset \Omega^\theta$ be the event that $\sup_{0 \le s \le a_5 - a_2} |W_s^\theta| \le r$, where the constant 
r will be selected later. On the event £g D (£ v X £ 9 ) n £, we have 

sup |G t - (a 2 + t)\ < y/e{r + l + c 2 ). 

0<i<r/ 

Therefore, on this event, we have 

S TI = a 2 + yfeW° Tl + e H g e 2 {Wf ', a 2 + a) dW s e 
Jo 

rTj 



rri 

+ r/ + y/e / 53 (W/ , a 2 + s) ds + i?, 
Jo 



where the remainder R satisfies ¥(R > c 3 e) < e 2 , the constant c 3 depending 
on r [here, we used the fact that g\ and g 3 are bounded in C 1 (D)]. For each 
G O^ the realization WJf is fixed and 77 is just a constant. Since g 3 is 
bounded, we have the estimate | /J 7 g 3 (Wf,a 2 + s) ds| < c 4 if G £* '. 
As a function of 6 the random variable 

U {J) = yfeW 9 ^ + e P g £ 2 {Wf,a 2 + s) dW e s 
Jo 

is Gaussian with the variance equal to e /J 7 (1 + y/eg 2 (Wf, a 2 + s)) 2 ds. 

We are interested in the restriction of ^^(uj 8 ) to the set £ e . Since g 2 is 
bounded, it is not difficult to show that for large enough r and all sufficiently 
small e, there is a constant C5, which does not depend on ui^ and e, such 
that for any Borel set A C [—(2 +C4)y / i, (2 + c^-y/e], we have 

(68) PV G G A) > ^=X(A), 

V £ 

where A is the Lebesgue measure. 

Let us collect all the pieces of the proof together. Take k\ >c\. With k± 
fixed, we have P^(£T) > q\yfe for some constant q\, which does not depend 
on e. Let us select r large enough so that (68) holds for all uj^ G £ <p . 

If {^,oj e ) e£ s n {£* n £ e )n£, then 9 rj G [6 - k 2 s,e + k 2 e], provided 
that R < c 3 e and 



%-k 2 e + c 3 e- (a 2 + Ti + y/I * g £ 3 (Wf ,a 2 + s) ds^j , 
% + k 2 e - c 3 e - (a 2 + 77 + y 7 ^ ' 9 e 3 {Wf , a 2 + s) ds^j 



Take &2 > 2C3. Then, the interval on the right-hand side of this formula is 
of length at least k 2 e and is centered not further than (1 + C4)y/e from the 
origin. Therefore, by (68), for each G £ ip fixed, the P^-probability of this 
event intersected with £ e is at least k 2 c^y/e. Since P(£s H £ H {i? < Qe}) > 
1 - 3e 2 and PP(£P) > q ly /e, we obtain that 

P(($ r/ , 6 r/ ) e1Z,ti < (t) > qiVek 2 c 5y /e - 3e 2 > c 2 e, 
where $c_2 = q_1k_2c_5/2$ and $\varepsilon$ is sufficiently small. This justifies (64) and thus completes the proof of Lemma 2.3. □ 

A.3. Diophantine approximations. Let us recall some facts about continued fractions (see [9]). Let $\rho = [a_1, \ldots, a_n, \ldots]$ be an irrational number between zero and one written as a continued fraction. Let $p_n/q_n = [a_1, a_2, \ldots, a_n]$ be the $n$th convergent. Then, $p_{2n}/q_{2n}$ increases to $\rho$ and $p_{2n-1}/q_{2n-1}$ decreases to $\rho$. Also, $q_n = q_{n-1}a_n + q_{n-2}$, so $q_n$ grows at least exponentially. Namely, 

(69) $q_n \ge f_n$ (the $n$th Fibonacci number). 

Thus, (2) implies that there is a constant $c$ such that 

(70) $q_{n+1} \le c\,q_n \ln^2 q_n$. 
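The recurrence $q_n = q_{n-1}a_n + q_{n-2}$, the alternation of the convergents around $\rho$ and the bound (69) can be checked numerically. The following Python sketch is only an illustration: the list `digits` of partial quotients is a hypothetical choice, the helper names are not taken from the paper, and the last convergent is used as an exact rational stand-in for $\rho$.

```python
from fractions import Fraction

def convergents(a):
    """Return [(p_1, q_1), ..., (p_n, q_n)] for rho = [a_1, ..., a_n]."""
    p_prev, q_prev = 1, 0   # (p_{-1}, q_{-1})
    p, q = 0, 1             # (p_0, q_0)
    out = []
    for a_n in a:
        # p_n = a_n p_{n-1} + p_{n-2},  q_n = a_n q_{n-1} + q_{n-2}
        p_prev, p = p, a_n * p + p_prev
        q_prev, q = q, a_n * q + q_prev
        out.append((p, q))
    return out

def fibonacci(n):
    """f_1 = f_2 = 1, f_n = f_{n-1} + f_{n-2}."""
    x, y = 1, 1
    for _ in range(n - 1):
        x, y = y, x + y
    return x

# Hypothetical partial quotients a_1, ..., a_12 (an assumption for the demo).
digits = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8]
conv = convergents(digits)
rho = Fraction(*conv[-1])   # rational stand-in for rho

# Bound (69): q_n >= f_n for every n.
fib_bound_holds = all(q >= fibonacci(n)
                      for n, (_, q) in enumerate(conv, start=1))
even = [Fraction(p, q) for p, q in conv[1::2]]   # n = 2, 4, ...
odd = [Fraction(p, q) for p, q in conv[0::2]]    # n = 1, 3, ...
```

Exact rational arithmetic is used so that the monotonicity of `even` and `odd` can be compared without rounding issues.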



Let $\|\cdot\|$ denote the distance to the nearest integer. A rational number $p/q$ is called the best approximation of the second kind to the number $\rho$ if 

$|p - q\rho| < |\tilde p - \tilde q\rho|$ 

for all rational numbers $\tilde p/\tilde q \ne p/q$ such that $0 < \tilde q \le q$. Thus, $\|q\rho\|$ is minimal among all $\|\tilde q\rho\|$ with $0 < \tilde q \le q$. Then (see [9], Section 6), the best approximations of the second kind are exactly the convergents $p_n/q_n$. From here, it follows that 

(71) $\|q\rho\| < \|q_n\rho\|$ implies that $q \ge q_{n+1}$. 

It is known that 

(72) $\Big|\rho - \frac{p_n}{q_n}\Big| \le \frac{1}{q_n q_{n+1}}$ 

and 

(73) $\Big|\rho - \frac{p_n}{q_n}\Big| \ge \frac{1}{q_n(q_n + q_{n+1})}$. 
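Multiplied by $q_n$, the bounds (72) and (73) read $1/(q_n + q_{n+1}) \le \|q_n\rho\| \le 1/q_{n+1}$, while (71) says that no $q < q_{n+1}$ approximates better than $q_n$. A minimal brute-force check is sketched below. As an assumption made purely for illustration, the rotation number is $[2, 2, 2, \ldots] = \sqrt{2} - 1$ truncated to 40 partial quotients, so that exact rational arithmetic can be used; the function names are hypothetical.

```python
from fractions import Fraction

def dist_to_int(x):
    """||x||: distance from the rational number x to the nearest integer."""
    frac = x - (x.numerator // x.denominator)
    return min(frac, 1 - frac)

def convergent_denominators(rho, count):
    """Denominators q_1, ..., q_count of the convergents of rho in (0, 1)."""
    qs, q_prev, q, x = [], 0, 1, rho
    for _ in range(count):
        x = 1 / x
        a = x.numerator // x.denominator   # next partial quotient
        x -= a
        q_prev, q = q, a * q + q_prev
        qs.append(q)
    return qs

def check(rho, qs):
    """Check (71)-(73) for every consecutive pair (q_n, q_{n+1}) in qs."""
    for q_n, q_next in zip(qs, qs[1:]):
        d = dist_to_int(q_n * rho)
        # (72) and (73), multiplied through by q_n
        if not Fraction(1, q_n + q_next) <= d <= Fraction(1, q_next):
            return False
        # (71): ||q rho|| < ||q_n rho|| is impossible for q < q_{n+1}
        if any(dist_to_int(q * rho) < d for q in range(1, q_next)):
            return False
    return True

# Assumed rotation number: 40 partial quotients, all equal to 2.
rho = Fraction(0)
for _ in range(40):
    rho = 1 / (2 + rho)

qs = convergent_denominators(rho, 10)
all_bounds_hold = check(rho, qs)
```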



Let us show that condition (2) holds for a set of $\rho$ which has Lebesgue measure one. Indeed, let $\mu$ be the measure on $[0, 1]$ with the density $((\ln 2)(1 + x))^{-1}$. Then (see [9]), 

$\mu(a_n > n^2) = \mu(a_1 > n^2) = \frac{1}{\ln 2} \ln\frac{n^2 + 2}{n^2 + 1} = \frac{1}{\ln 2} \ln\Big(1 + \frac{1}{n^2 + 1}\Big) \le \frac{1}{(\ln 2)(n^2 + 1)}$, 

so, by the Borel-Cantelli lemma, condition (2) holds for $\mu$-almost all $\rho$ and, therefore, also for Lebesgue almost all $\rho$. 
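The tail probability used in the Borel-Cantelli argument can be evaluated directly from the density $((\ln 2)(1 + x))^{-1}$. The sketch below is an illustration only (the names `gauss_measure` and `tail` are hypothetical). It uses the fact that $a_1 > n^2$ exactly when $0 < x < 1/(n^2 + 1)$, and it checks that the partial sums of the tail probabilities stay bounded.

```python
import math

def gauss_measure(lo, hi):
    """mu([lo, hi]) for the density 1 / ((ln 2)(1 + x)) on [0, 1]."""
    return (math.log1p(hi) - math.log1p(lo)) / math.log(2)

def tail(n):
    """mu(a_1 > n^2): the first partial quotient exceeds n^2
    exactly when 0 < x < 1 / (n^2 + 1)."""
    return gauss_measure(0.0, 1.0 / (n * n + 1))

# Partial sum of the tail probabilities; it is bounded by
# (1 / ln 2) * sum 1 / (n^2 + 1), so the series converges.
partial_sum = sum(tail(n) for n in range(1, 10001))
```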

We still need to prove Lemmas 4.5, 5.1 and 5.2. 

Proof of Lemma 4.5. Let $N(x) = \min\{n : d(Y_n, B) \le \varepsilon^{1/3}\}$. We claim that for any $\delta_2 > 0$, all sufficiently small $\varepsilon$ and all $x \in \Gamma$, we have $N(x) \le \varepsilon^{-(1/3+\delta_2)}$. Applying this claim (with $\delta_2 < \delta$) to $Y_{[\varepsilon^{-1/3}+1]}$ instead of $x$, we see that for all $x$, there exists some $n$ such that $[\varepsilon^{-1/3} + 1] \le n \le [\varepsilon^{-1/3} + 1] + \varepsilon^{-(1/3+\delta_2)}$ and $d(Y_n, B) \le \varepsilon^{1/3}$, thus establishing Lemma 4.5. 

To prove the claim, take the smallest $m$ such that $\|q_m\rho\| \le \varepsilon^{1/3}$. Suppose that $m$ is even (the case of odd $m$ is almost identical) so that $q_m\rho > p_m$. Consider the interval $J = \{x : 0 \le \theta(x) \le \|q_m\rho\|\}$. Let $N_J(x)$ be the first positive time when $Y_n \in J$. Then, $N \le N_J$ so it is enough to show that $N_J(x) \le \varepsilon^{-(1/3+\delta_2)}$. The maximum of $N_J$ is achieved on $J$ since if $x \notin J$ and $\theta(y) = \theta(x) - \rho$, then $N_J(y) = N_J(x) + 1$. 

Note that $N_J$ is piecewise constant on $J$ and its discontinuities are the preimages of the endpoints of $J$ under the first return map to $J$. Our choice of $J$ ensures that there is only one discontinuity inside $J$. Indeed, let $y \in \mathrm{Int}(J)$ be the preimage of the left endpoint. Then, $0 < \theta(y) < \|q_m\rho\|$ and $\theta(y) + s\rho = t$ for some $s, t \in \mathbb{N}$. Then, $\|s\rho\| = \theta(y) < \|q_m\rho\|$. Thus, $s \ge q_{m+1}$ by (71). On the other hand, let $y^*$ be such that 

$\theta(y^*) = \|q_{m+1}\rho\| = p_{m+1} - q_{m+1}\rho$ ($m + 1$ is odd). 

If we take $s = q_{m+1}$, then 

$\theta(y^*) + s\rho = p_{m+1} - q_{m+1}\rho + q_{m+1}\rho = p_{m+1}$, 

so $y^*$ is the preimage of the left endpoint of $J$. 

Likewise, let $y \in \mathrm{Int}(J)$ be the preimage of the right endpoint. Then, $\theta(y) + s\rho = q_m\rho - p_m + t$, so 

(74) $\|s\rho\| = \|q_m\rho\| - \theta(y) < \|q_m\rho\|$, 

(75) $\|(s - q_m)\rho\| = \theta(y) < \|q_m\rho\|$. 

Formulas (74) and (71) imply that $s \ge q_{m+1}$ and therefore $s - q_m > 0$. Then, (75) and (71) imply that $s - q_m \ge q_{m+1}$. But, 

$\theta(y^*) + (q_m + q_{m+1})\rho = p_{m+1} + q_m\rho = \|q_m\rho\| + p_{m+1} + p_m$. 

Hence, $y^*$ is also the preimage of the right endpoint. Thus, $N_J$ has only one discontinuity on $J$ and so it takes only two values: 

$N_J(y) = \begin{cases} q_{m+1} + q_m, & \text{if } 0 \le \theta(y) < \|q_{m+1}\rho\|, \\ q_{m+1}, & \text{if } \|q_{m+1}\rho\| \le \theta(y) \le \|q_m\rho\|. \end{cases}$ 

From inequality (72), with $m - 1$ instead of $n$, we obtain that $q_m \le 1/\|q_{m-1}\rho\| \le \varepsilon^{-1/3}$. Using (70), we obtain $q_m + q_{m+1} \le \varepsilon^{-(1/3+\delta_2)}$, which completes the proof of the lemma. □ 
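The two-valued structure of the first return time $N_J$ can be observed directly. The sketch below is an illustration under assumed data: the rotation number is $[2, 2, 2, \ldots]$ truncated to 40 partial quotients, $m = 4$ (so $q_m = 29$, $q_{m+1} = 70$), and the interval is taken half-open, $J = [0, \|q_m\rho\|)$, with sample points strictly inside $J$ to avoid the endpoints. All names are hypothetical.

```python
from fractions import Fraction

def dist_to_int(x):
    """||x||: distance from the rational number x to the nearest integer."""
    frac = x - (x.numerator // x.denominator)
    return min(frac, 1 - frac)

def first_return_time(theta, rho, length, max_steps=1000):
    """First n >= 1 such that (theta + n * rho) mod 1 lies in [0, length)."""
    for n in range(1, max_steps + 1):
        theta = (theta + rho) % 1
        if theta < length:
            return n
    raise ValueError("no return within max_steps")

# Assumed rotation number: 40 partial quotients, all equal to 2.
rho = Fraction(0)
for _ in range(40):
    rho = 1 / (2 + rho)

q_m, q_next = 29, 70                 # q_4 and q_5 for [2, 2, 2, ...]
length = dist_to_int(q_m * rho)      # ||q_m rho||, the length of J
gap = dist_to_int(q_next * rho)      # ||q_{m+1} rho||, the discontinuity

samples = [length * Fraction(j, 50) for j in range(1, 50)]
times = [first_return_time(theta, rho, length) for theta in samples]
```

In this example the return time equals $q_m + q_{m+1} = 99$ to the left of the discontinuity and $q_{m+1} = 70$ to the right of it, in agreement with the formula for $N_J$.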

Proof of Lemma 5.1. Formulas (73) and (70) together imply that for some constant $c$, we have 

(76) $|p_n - q_n\rho| \ge \frac{c}{q_n \ln^2 q_n}$. 

Now, for any $p$ and $q$, we can find $n$ such that $q_n \le q < q_{n+1}$. We have, therefore, 

$|p - q\rho| \ge |p_{n+1} - q_{n+1}\rho| \ge \frac{c}{q_{n+1} \ln^2 q_{n+1}} \ge \frac{c'}{q_n \ln^2 q_n \ln^2(q_n \ln^2 q_n)} \ge \frac{c'}{q \ln^2 q \ln^2(q \ln^2 q)}$ 

for some $c'$. This implies the statement of the lemma. □ 



Proof of Lemma 5.2. Take the smallest $n$ such that $q_n \ge N$. Recall that the Denjoy-Koksma inequality (see [2], Lemma 1 of Section 3.4) says that if $q_n$ is the denominator of the $n$th convergent, then for any function $\phi$ and any points $x_1, x_2$ on the unit circle, 

$\Big|\sum_{j=0}^{q_n-1} [\phi(x_1 + j\rho) - \phi(x_2 + j\rho)]\Big| \le \mathrm{Var}(\phi)$, 

where $\mathrm{Var}(\phi)$ denotes the variation of $\phi$. Let $\phi(z) = 1/\|z\|$ if $\|z\| \ge \min_{n \le N} d(Y_n, B)$ and $\phi(z) = 1/\min_{n \le N} d(Y_n, B)$ otherwise. Applying the Denjoy-Koksma inequality with $x_1 = \theta(x)$ and $x_2$ such that 



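$\sum_{j=0}^{q_n-1} \phi(x_2 + j\rho) = q_n \int_0^1 \phi(z)\,dz$, 

we get 

$\sum_{n=1}^{N} \frac{1}{d(Y_n, B)} \le q_n \int_0^1 \phi(z)\,dz + \mathrm{Var}(\phi)$. 

Now, 

$\int_0^1 \phi(z)\,dz \le c\,\Big|\ln \min_{n \le N} d(Y_n, B)\Big|$, $\qquad \mathrm{Var}(\phi) \le \frac{2}{\min_{n \le N} d(Y_n, B)}$. 

By (70), we have $q_n \le c\,q_{n-1} \ln^2 q_{n-1} \le cN \ln^2 N$, which proves (a). 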
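The Denjoy-Koksma inequality for the truncated function $\phi$ can be tested numerically. The sketch below is only an illustration with assumed data: $\rho = \sqrt{2} - 1$, the convergent denominator $q = 169$, and a truncation level $d = 0.01$ standing in for $\min_{n \le N} d(Y_n, B)$. For this $\phi$ one has $\int_0^1 \phi(z)\,dz = 2 + 2\ln(1/(2d))$ and $\mathrm{Var}(\phi) = 2(1/d - 2)$. Floating-point arithmetic is used, so a tiny tolerance is added in the comparisons.

```python
import math

def dist_to_int(z):
    """||z||: distance from z to the nearest integer."""
    return abs(z - round(z))

def phi(z, d):
    """Truncated 1/||z||: equals 1/||z|| if ||z|| >= d and 1/d otherwise."""
    return 1.0 / max(dist_to_int(z), d)

def birkhoff_sum(x, rho, q, d):
    """Sum of phi along the orbit x, x + rho, ..., x + (q - 1) rho."""
    return sum(phi(x + j * rho, d) for j in range(q))

def variation(d):
    """Total variation of phi(., d) over the circle, for 0 < d < 1/2."""
    return 2.0 * (1.0 / d - 2.0)

def mean(d):
    """Integral of phi(., d) over [0, 1], for 0 < d < 1/2."""
    return 2.0 + 2.0 * math.log(1.0 / (2.0 * d))

rho = math.sqrt(2.0) - 1.0   # assumed rotation number [2, 2, 2, ...]
q, d = 169, 0.01             # q = q_6 is a convergent denominator of rho

sums = [birkhoff_sum(i / 97.0, rho, q, d) for i in range(97)]
# Two-point form: sums over two orbits differ by at most Var(phi).
two_point_ok = max(sums) - min(sums) <= variation(d) + 1e-9
# One-point form: each sum is within Var(phi) of q times the mean.
one_point_ok = all(abs(s - q * mean(d)) <= variation(d) + 1e-9 for s in sums)
```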

To prove (b), let $J_m = [[N^{\delta m/2}], [N^{\delta(m+1)/2}]]$, where $0 \le m \le [2/\delta]$. The union of these segments covers the interval $[1, N]$. Since $n \le [N^{\delta(m+1)/2}]$ on $J_m$, it follows by part (a), applied to the initial point $Y_{[N^{\delta m/2}]}$ instead of $x$, that there is a constant $c$ such that 



<c 



N 3S(m+l)/4 i n 2 ( N 8(m+l)/2j 



In min d(Y n ,B) 

n£Jm 



+ 



jY<5(m+l)4 



min ne j m <i(Y n ,B) 
Combining this with (27), we get, for some other constant $c$, 




Summation over m proves part (b) of the lemma. □ 

A.4. The case of several saddle points. In this section, we discuss the case of several periodic components. First, we assume that each of the domains $U_k$, $k = 1, \ldots, n$, contains a single critical point $M_k$ (a maximum or a minimum of $H$). Let $A_k$, $k = 1, \ldots, n$, be the saddle points of $H$, such that $A_k$ is on the boundary of $U_k$. We denote the boundary of $U_k$ by $\gamma_k$. Let $p_k = \pm\int_{\gamma_k} |\nabla H|\,dl$, where the sign $+$ is taken if $A_k$ is a local minimum for $H$ restricted to $U_k$, and $-$ is taken otherwise. 

The phase space of the limiting process is now a graph $G$ which consists of $n$ edges $I_k$, $k = 1, \ldots, n$ (segments labeled by $k$), where each segment is either $[H(M_k) - H(A_k), 0]$ (if $M_k$ is a minimum) or $[0, H(M_k) - H(A_k)]$ (if $M_k$ is a maximum). All the edges share a common vertex (the origin). Thus, a point on the graph can be determined by specifying $k$ (the number of the edge) and the coordinate on the edge. We define the mapping $h : T^2 \to G$ as follows: 

$h(x) = \begin{cases} 0, & \text{if } x \in \mathcal{E}, \\ (k, H(x) - H(A_k)), & \text{if } x \in U_k. \end{cases}$ 

We shall use the notation $h_k$ for the coordinate on $I_k$. As in the case of one periodic component, we define the limiting process via its generator $\mathcal{L}$. First, for each $k$, we define the differential operator $L_kf = a_k(h_k)f'' + b_k(h_k)f'$ on the interior of $I_k$, where the coefficients $a_k$ and $b_k$ are given by formulas (3) and (4) [where $\gamma(h_k)$ is defined for each of the periodic components and has the same meaning as in the case of one periodic component]. The domain of $\mathcal{L}$ consists of those functions $f \in C(G)$ which 

(a) are twice continuously differentiable in the interior of each of the edges; 

(b) have the limits $\lim_{h_k \to 0} L_kf(h_k)$ and $\lim_{h_k \to (H(M_k) - H(A_k))} L_kf(h_k)$ at the endpoints of each of the edges, the value of the limit $q = \lim_{h_k \to 0} L_kf(h_k)$ being the same for all edges; 

(c) have the limits $\lim_{h_k \to 0} f'(h_k)$ and 

$\sum_{k=1}^{n} p_k \lim_{h_k \to 0} f'(h_k) = 2\,\mathrm{Area}(\mathcal{E})\,q$. 

For functions $f$ which satisfy the above three properties, we define $\mathcal{L}f = L_kf$ in the interior of each edge, and as the limit of $L_kf$ at the endpoints of the edges. 

As in the case of one periodic component, we have the following theorem. 



Theorem 2. The measure on $C([0, \infty), G)$ induced by the process $Y_t^\varepsilon = h(X_t^\varepsilon)$ converges weakly to the measure induced by the process with the generator $\mathcal{L}$ with the initial distribution $h(X_0^\varepsilon)$. 

The proof of this theorem requires some modifications to the proof of The- 
orem 1. We sketch these modifications without providing all of the technical 
details. 

(I) Recall the definition of the Markov chains and from Section 2. In 
the case of several periodic components, the state spaces for these Markov 
chains will be slightly different. Namely, we replace the curves 7 and 7 
defined in Section 2 by the curves 7 = UfcTfc an d 7 = Ufc7fc; where 7 fc = 
{|-fffc| =e a }. In the proof of Lemma 2.4, U will now stand for the union 
U = Ufc Uk of the loops. Let \i and v be the invariant measures on 7 and 7, 
respectively. Let us study the asymptotics ^(7^) for different k. 

Let ilk be the normalized restriction of the measure \i to 7^, that is, 

fi k (A)=fi(A)/^ k ) 
for each measurable subset A of 7 fe . Instead of (7), we now have 

(77) E lik a = 2^J \VH\dlj Area(C/ fe )e a (l + o(l)) as e -> 0. 



Let us prove that 



(78) fi(%)= j | Vg | dz + ase^O. 

Let r^ fc ) be the time when the process visits 7 fc for the first time. Similarly 
to (15), we obtain 

E u t Area(£) 



K^a Avea(U) 



(79) 
and 

, , E^r^ Area(T 2 - IM 

80 7 t t t \ as e 0. 

V 7 E^o- Area(f/ fc ) 

Let Nk(T) be the number of times the process Xf travels from 7 to 7^ 
before time T and M(T) the number of times the process travels from 7 to 
7 before time T. Note that 

lim = /i(7 fc )(l + o(l)) as e -» 0. 

T^oo M(T) 
By the Birkhoff ergodic theorem, 
T 

f^o Nk(T) 

and 



lim Ar = (E u r {k) + E Mfc cr)(l + o(l)) as e -» 



Therefore, 



lim /i(7 fc )(E,rW+E Mfc q) i 

Now, (77) and (80) imply that the ratio /i(7fc)/ J Jk |ViT| dl is asymptotically 
independent of k, thus proving (78). 

(II) Let us use (78) to justify (11). Near the origin, we have 

f(h k )=f(0)+ lim f(h k )h k + o(h k ). 

Let r k = 1 if M k is a maximum and r k = — 1 if M& is a minimum. Observe 
that since fi and f are invariant, v(X% G 7 fc ) = /x(7&). Therefore, 



E„ - /(/«)) " £(Lf)(X e s ) ds 



E fc rfc(J 7fc I Vg| g) lim fefc ^o /'(ft fc ) 2gArea(g) 
L|Vff|cfl L|VfT|cfl 



+ o(i: 



Ol £ 



where the expression for E^r was obtained from (77), (78) and (79). 

(Ill) Parts (I) and (II) explain the new gluing conditions. The proof of 
the mixing of and £^ also needs to be modified. Namely, we introduce the 
stopping times r £) fc = min{n : d(X n , B k ) < e 1 ^ 2 }, where B k is the preimage 
of A k on r. Also, let N k {x) = mm{n:d(Y n ,B k ) < y/ne}. Lemma 4.6 needs 
to be modified as follows. 

Lemma 4.6*. For any 5 > 0, there exists some eq > such that for any 

k, 

(81) P,(r Cjfc = iV fc (x) and < T ® for all j + k)> e 1 /^ 6 

for all e < £q, for all x £ T. 

Equation (81) implies, in particular, that 

Px(r (fc) < for all j ^ k) > e 1/&+& . 
The argument of Section A.2 now gives 

sup(Var(JT(M!/) " ti*v))) < rc n£l/W , 

(82) 

sup(Var(P 2 n (x,dy) - v(dy))) < rc n£l/b+S 

for some r > and < c < 1, which suffices for the purposes of Section 2. 

To derive (81), we proceed as in the proof of Lemma 4.6. We split N^(x) = 
n\ + ?i2 as in Section 5. Let us consider the more complicated case when 
n\ > 0. The bad times are now defined by the condition 

d(Y n ,Bj) < V^e 1/2 ~ ai for some j. 

The contribution of the bad times is estimated as before, except that now 
we have "very bad times" when 

d(X n ,Bj) < e 1/2 ~ 5 for some j + k 

because then the orbit can be sucked into Uj. Let 773 = [e~( 1 / 2 ~ 2<5 )/( 3 / 2+<5 )]. 
Observe that, with probability 1 — 0(e R ), the time difference between two 
very bad times corresponding to the same j is at least 773. Indeed, the prob- 
ability that both m\ and m<i are very bad is 0(e R ) unless m = 777-2 — m i 
satisfies 

||mp|| < y/me 1 / 2 ' 25 . 

This is impossible for m < 77.3 since m~^ 1+<5 ) < ||m||, due to Lemma 5.1. We 
can estimate P(r^^ > 77,3 for all j 7^ k) from below by a constant since there 
is at most one very bad time for each j before 773 and even if the orbit passes 
near a saddle point, it avoids Uj with positive probability. Using arguments 
similar to those in the proof of Lemma 5.4, the conditional probability of 
having very bad time between 773 and m, given that t^' > 773 for all j 7^ k, 
can be bounded by 

77i e l / 2 ~ & ( sfm 



Thus, we have the following analogue of Part (a) of Lemma 5.4. 
(a*) There is a positive constant k\ such that for all sufficiently small e 
and all x such that n± 7^ 0, we have 

FJ\\e(X ni ) + pn 2 \\ < J^e and r (j) > m for all j ^ k) > , 

Finally, P(r^ > N k {x) for all j + k\r® > n x for all j ^ k) can be bounded 
from below by a constant similarly to ¥(r^' > 773 for all j 7^ k) and thus an 
analogue of Part (b) of Lemma 5.4 also remains valid. 
The rest of the proof of (81) proceeds as in Section 5. 

Finally, the result remains true if some of the periodic components contain more than one critical point. In this case, the edges $I_k$ should be replaced by subgraphs $G_k$ which contain one common vertex, corresponding to the entire ergodic component. The other vertices of $G_k$ correspond to the critical points of $H$ inside $U_k$ and the gluing conditions on those vertices are given in [7]. 

Acknowledgments. We are grateful to Prof. M. Freidlin for introducing 
us to this problem and for many useful discussions. 